Large Datasets in a Unit Test

I spent my whole day essentially working on one unit test. 

Yes, just one unit test. But let me explain.

This part of the app just does something very simple — serializes the results of a database query into XML.  Of course, it’s not “simple” but we have tools to make this simple.  You take some Spring and you mix in a little JiBX and they do the work.  You just have to use them.

To test this, I mocked up a small dataset and used that to get the XML.  I could manually check those values and it was fine.  But the query won’t return a small amount of data — it will be large.  At least a couple hundred items.

You could say, “Well, if it works on a small set, then it will work fine on a large one.”  And, really, you are probably right.  But how do you know?  And how do you make sure?  Since we are usually going to have a few hundred items to serialize, shouldn’t we put that into our tests?  Why, yes.

My plan was thus: have my desired response in an XML file to load up and some mock data to load into the test.  Getting the base was easy — I just took a snapshot of some query results and made that into the XML file. I would then deserialize the XML file into a DOM that should be the same DOM as the mock data (with JiBX, it  becomes trivial to marshal and unmarshal objects into XML).

But how to get the XML object?  I started by writing a Python script to read the XML and put that into a snippet of Java. To the people who don’t use scripting and/or dynamic languages this may seem strange.  To me, this is natural — Python is much better at handling text and (IMHO) better at dealing with XML. 
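For the curious, that first script looked roughly like the sketch below. This is a reconstruction, not the original: the `<items>`/`<item>` element names, the `Item` class, and the idea that every field has a String setter are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def to_java_snippet(snapshot_xml, class_name="Item", list_name="items"):
    """Turn <items><item>...</item></items> into Java that rebuilds the list.

    Assumes (hypothetically) that each child element of <item> maps to a
    String setter on the Java class, e.g. <name> -> setName("...").
    """
    lines = [f"List<{class_name}> {list_name} = new ArrayList<{class_name}>();"]
    for n, record in enumerate(ET.fromstring(snapshot_xml).findall("item")):
        var = f"item{n}"
        lines.append(f"{class_name} {var} = new {class_name}();")
        for field in record:
            setter = "set" + field.tag[0].upper() + field.tag[1:]
            # Escape backslashes and quotes so the Java string literal stays valid.
            value = (field.text or "").strip()
            value = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{var}.{setter}("{value}");')
        lines.append(f"{list_name}.add({var});")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_java_snippet("<items><item><name>Widget</name></item></items>"))
```

With a couple hundred items, the output runs to well over a thousand lines of Java that has to be pasted in by hand every time the data changes.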

As I studied the resulting Java, I knew that I didn’t want to put that into a Java class.  It was too hard to change!  I didn’t want to run the script, paste the output into the Java class, massage it until it compiled, and then test.  So I thought for a while and then remembered I could do all this in Spring!  I could take the XML and simply put it into a Spring config, populating my data. Then, in my unit test, I could call the applicationContext to get the object.

So I changed my Python script to create a Spring config bean instead of a Java snippet. I put it into its own config, which I only load in my unit test. I named it mockdata.xml.
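Here is a minimal sketch of what such a generator can look like. It is not my actual script: the snapshot layout (`<items>` containing `<item>` elements), the `com.example.Item` class, and the `mockItems` bean id are all hypothetical, and I am assuming each field maps to a simple bean property. The `<beans>` header is also simplified; depending on your Spring version you may need to add the schema location attributes or a DTD declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical snapshot of query results; the real one had a few hundred items.
SNAPSHOT = """<items>
  <item><id>1</id><name>Widget</name></item>
  <item><id>2</id><name>Gadget</name></item>
</items>"""


def build_spring_config(snapshot_xml, bean_id="mockItems",
                        item_class="com.example.Item"):
    """Emit a Spring <beans> document that defines a list of mock items."""
    source = ET.fromstring(snapshot_xml)
    beans = ET.Element(
        "beans", {"xmlns": "http://www.springframework.org/schema/beans"})
    # One top-level ArrayList bean, filled through its constructor argument.
    holder = ET.SubElement(
        beans, "bean", {"id": bean_id, "class": "java.util.ArrayList"})
    items = ET.SubElement(ET.SubElement(holder, "constructor-arg"), "list")
    for record in source.findall("item"):
        bean = ET.SubElement(items, "bean", {"class": item_class})
        # Each child element becomes a property, e.g. <name> -> setName(...).
        for field in record:
            ET.SubElement(bean, "property", {
                "name": field.tag,
                "value": (field.text or "").strip(),
            })
    return ET.tostring(beans, encoding="unicode")


if __name__ == "__main__":
    # Write the generated config where the unit test can load it.
    with open("mockdata.xml", "w", encoding="utf-8") as out:
        out.write(build_spring_config(SNAPSHOT))
```

The unit test can then load mockdata.xml into an application context and ask for the `mockItems` bean, so regenerating the data is just a matter of rerunning the script against a fresh snapshot.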

And, you know, it worked like a charm.

This shouldn’t have taken me all day, but it did. I spent too much time on the Java snippet instead of letting Spring do all the work for me. But, with the Python script, I have a reusable solution, so when the data needs to change, I can easily redo my tests.

Testing should not only be fun, but correct as well.
