Large Datasets in a Unit Test
I spent my whole day essentially working on one unit test.
Yes, just one unit test. But let me explain.
This part of the app just does something very simple — serializes the results of a database query into XML. Of course, it’s not “simple” but we have tools to make this simple. You take some Spring and you mix in a little JiBX and they do the work. You just have to use them.
To test this, I mocked up a small dataset and used that to get the XML. I could manually check those values and it was fine. But the query won’t return a small amount of data — it will be large. At least a couple hundred items.
You could say, “Well, if it works on a small set, then it will work fine on a large one.” And, really, you are probably right. But how do you know? And how do you make sure? Since we are usually going to have a few hundred items to serialize, shouldn’t we put that into our tests? Why, yes.
My plan was thus: keep my desired response in an XML file, and load some mock data into the test. Getting the base was easy: I just took a snapshot of some query results and made that into the XML file. I would then deserialize the XML file into objects that should match the objects built from the mock data (with JiBX, it becomes trivial to marshal and unmarshal objects to and from XML).
But how to get the XML object? I started by writing a Python script to read the XML and put that into a snippet of Java. To the people who don’t use scripting and/or dynamic languages this may seem strange. To me, this is natural — Python is much better at handling text and (IMHO) better at dealing with XML.
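The script itself isn't reproduced here, but a minimal sketch of the idea might look like the following. Everything specific in it is an assumption made for illustration: the `<items>`/`<item>` element names, the `Item` class, and setters that all take a String.

```python
import xml.etree.ElementTree as ET

# Hypothetical snapshot; the real file's element names are not shown in this post.
SNAPSHOT = """<items>
  <item><id>1</id><name>alpha</name></item>
  <item><id>2</id><name>beta</name></item>
</items>"""

def to_java_snippet(xml_text, class_name="Item"):
    """Emit Java statements that rebuild each <item> as a class_name object.

    Assumes every child element maps to a String setter (id -> setId).
    """
    lines = ["List<%s> items = new ArrayList<%s>();" % (class_name, class_name)]
    for n, item in enumerate(ET.fromstring(xml_text).iter("item")):
        var = "item%d" % n
        lines.append("%s %s = new %s();" % (class_name, var, class_name))
        for field in item:
            setter = "set" + field.tag[0].upper() + field.tag[1:]
            # Escape backslashes and quotes so the value is a valid Java string literal.
            value = (field.text or "").replace("\\", "\\\\").replace('"', '\\"')
            lines.append('%s.%s("%s");' % (var, setter, value))
        lines.append("items.add(%s);" % var)
    return "\n".join(lines)

java_snippet = to_java_snippet(SNAPSHOT)
print(java_snippet)
```

The output still has to be pasted into a Java class by hand, and with a few hundred items that is a lot of generated code to paste.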
As I studied the resulting Java, I knew that I didn’t want to put that into a Java class. It was too hard to change! I didn’t want to run the script, put it into the Java class, massage it so it compiles, and then test. So I thought a while and then remembered I could do all this in Spring! I could take the XML and simply put it into a Spring config, populating my data. Then, in my unit test, I could call the applicationContext to get the object.
So I changed my Python script to create a Spring config bean instead of a Java snippet. I put it into its own config, which I only load in my unit test. I named it mockdata.xml.
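As a rough sketch of that change (not the actual script), the generator could emit Spring bean definitions instead of Java statements. The `<items>`/`<item>` element names, the `mockItems` bean id, and the `com.example.Item` class are all hypothetical, and the DOCTYPE or schema declaration that Spring expects at the top of the file is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical snapshot; the real query results are not shown in this post.
SNAPSHOT = """<items>
  <item><id>1</id><name>alpha</name></item>
  <item><id>2</id><name>beta</name></item>
</items>"""

def to_spring_config(xml_text, bean_id="mockItems", class_name="com.example.Item"):
    """Build a <beans> document holding one list bean with an inner bean per <item>.

    Assumes each child element of <item> maps to a simple bean property.
    """
    beans = ET.Element("beans")
    holder = ET.SubElement(beans, "bean", {"id": bean_id, "class": "java.util.ArrayList"})
    # ArrayList has a constructor that takes a Collection, so the list is passed in there.
    ctor = ET.SubElement(holder, "constructor-arg")
    lst = ET.SubElement(ctor, "list")
    for item in ET.fromstring(xml_text).iter("item"):
        bean = ET.SubElement(lst, "bean", {"class": class_name})
        for field in item:
            ET.SubElement(bean, "property", {"name": field.tag, "value": field.text or ""})
    return ET.tostring(beans, encoding="unicode")

spring_config = to_spring_config(SNAPSHOT)
print(spring_config)
```

Writing that string out to mockdata.xml (after adding the Spring header) would be the last step. When the data changes, the script is rerun and the test picks up the new file without any Java being touched.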
And, you know, it worked like a charm.
This shouldn’t have taken me all day, but it did. I spent too much time on the Java snippet instead of letting Spring do all the work for me. But, with the Python script, I have a reusable solution, so when the data needs to change, I can easily redo my tests.
Testing should not only be fun, but correct as well.