Saturday, April 17, 2010

Null Desperandum

A junior developer asked me this week when she should use Integer objects over int primitives. Many classes (eg, java.util.Calendar) use ints to represent quantities such as the month or the day of the month, and this makes perfect sense since there is no point in time with a null month. But what about the other way around, she asked? When would you have to use Integers?

I was a bit stumped at first. Although it's intuitively obvious to me that there are such occasions, I couldn't think of an example off the top of my head. So, I just waffled about null having a semantic meaning in certain circumstances.

Later, I thought of this one: imagine an application for lawyers in which a class captures a defendant's answer. If it were modelled using an int, you might see something like this in the system:

Application: How many days has it been since you last beat your wife?

Defendant: I've never beaten my wife.

Application: OK, I'll just put you down as "0 days since you last beat your wife".

Null would have been a reasonable value here and hence I'd argue an Integer should be used. An int of -1 (ie, "-1 day(s) ago"), the sentinel favoured in some APIs, could possibly be interpreted as "tomorrow".
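
To make that concrete, here is a minimal sketch. The class and method names are made up for illustration and don't come from any real system; the only point is that a null Integer can say "never" where an int cannot.

```java
// Hypothetical class for illustration only.
final class DefendantAnswer {
    // null means "not applicable": the event never happened.
    // An int field would force us to pick 0 or some magic value such as -1.
    private final Integer daysSinceLastIncident;

    DefendantAnswer(Integer daysSinceLastIncident) {
        this.daysSinceLastIncident = daysSinceLastIncident;
    }

    String describe() {
        if (daysSinceLastIncident == null) {
            return "never";
        }
        return daysSinceLastIncident + " day(s) ago";
    }

    public static void main(String[] args) {
        System.out.println(new DefendantAnswer(null).describe());
        System.out.println(new DefendantAnswer(0).describe());
    }
}
```

The price is that every caller has to check for null before unboxing, otherwise a NullPointerException is waiting.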

Damn my slow, coffee-addled brain.

Calendar Grrrs!

Gainfully employed once more, my first task has been to diagnose why a trading application sometimes grinds to a halt. Avoiding all the speculation that generally surrounds non-deterministic exceptions and performance issues, I put together a JMeter suite and ran a soak test. After roughly 60 minutes of being stressed, the application starts to barf. Cool - it's reproducible. But the reason why it started choking surprised me.

The application does a lot of copying of object graphs using Serialization. Fine - JVMs are pretty good at memory management these days and creating objects, even large numbers of them, is normally OK. In fact, a Sun engineer once told me that you should only start worrying about Garbage Collection when it reaches a hefty 5% of running time. This trading application was spending about 1% of time collecting garbage despite JConsole showing that the Heap Memory was bouncing up and down like a yo-yo.
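
For anyone who hasn't seen the technique, copying an object graph via serialization usually looks something like the sketch below. This is not the application's code; SerializationCopier and deepCopy are names I've invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical helper: deep-copies any Serializable graph by writing it
// to a byte array and reading it straight back.
final class SerializationCopier {
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Copy via serialization failed", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> trades = new ArrayList<>();
        trades.add("BUY 100");
        ArrayList<String> copy = deepCopy(trades);
        // Equal contents, but a distinct object.
        System.out.println(copy.equals(trades) && copy != trades);
    }
}
```

Every call allocates a fresh byte array plus a whole new graph, which is why the heap bounces around so much.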

So, time to get more invasive and attach the amazing YourKit (sadly not open source but worth its weight in gold). It showed that when client connections were timing out, the server-side threads servicing them were contending for a monitor on java.util.Hashtable.get() (Hashtable, unlike HashMap, has its methods synchronized, of course).

What's more, YourKit was telling me that it was java.util.Calendar.getInstance() that was calling this method on Hashtable. Firing up Eclipse and putting a breakpoint in java.util.Hashtable.get() showed exactly the path of execution that led to this:

Thread [main] (Suspended (breakpoint at line 333 in Hashtable))
Hashtable.get(Object) line: 333
GregorianCalendar(Calendar).setWeekCountData(Locale) line: 2445
GregorianCalendar(Calendar).<init>(TimeZone, Locale) line: 931
GregorianCalendar.<init>(TimeZone, Locale) line: 574
Calendar.createCalendar(TimeZone, Locale) line: 1006
Calendar.getInstance() line: 943

Calendar.getInstance() creates a GregorianCalendar and the constructor of this accesses cachedLocaleData, a static field of type java.util.Hashtable. Since it's static, all threads in the JVM can contend for its monitor.
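
The shape of the problem can be sketched with a toy model. To be clear, this is not the JDK source: ModelCalendar, its field names and the placeholder values are all invented, and only the pattern (a constructor reading a static, synchronized Hashtable) mirrors what the stack trace shows.

```java
import java.util.Hashtable;
import java.util.Locale;

// Simplified model, NOT the real java.util.Calendar.
final class ModelCalendar {
    // One table for the whole JVM. Hashtable.get() is synchronized, so every
    // constructor call on every thread has to take this one monitor.
    private static final Hashtable<Locale, int[]> CACHED_LOCALE_DATA = new Hashtable<>();

    final int firstDayOfWeek;

    ModelCalendar(Locale locale) {
        int[] data = CACHED_LOCALE_DATA.get(locale); // the contended call
        if (data == null) {
            data = new int[] {1, 1}; // placeholder values, not real locale data
            CACHED_LOCALE_DATA.put(locale, data);
        }
        firstDayOfWeek = data[0];
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                // Each thread queues up behind the same monitor on every iteration.
                for (int n = 0; n < 100_000; n++) {
                    new ModelCalendar(Locale.UK);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("done");
    }
}
```

Attach a profiler to something like this and the worker threads should show up blocked on the Hashtable monitor, much as the real ones did.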

Ouch.

This monitor contention may seem trivial, but if you're calling this method hundreds of times a second, it becomes a bottleneck.

Of course, it would have been better if the business domain objects being copied via serialization had not implemented java.io.Externalizable with readExternal and writeExternal methods that call Calendar.getInstance(). That was a crufty hack to work around an architectural problem (Sybase 12.5 not handling time zones in its timestamp data type, but that's another story).
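
As a rough illustration of how an Externalizable class might sidestep the hot path, here is a sketch under my own assumptions. TradeSnapshot and its fields are hypothetical, and this is not what the application does: it writes the raw epoch milliseconds and only touches a Calendar through a per-thread instance, so Calendar.getInstance() is called once per thread rather than once per object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.Calendar;

// Hypothetical domain object, for illustration only.
final class TradeSnapshot implements Externalizable {
    // Calendar is mutable and not thread-safe, hence one instance per thread.
    // Assumption: the default time zone does not change while the JVM runs.
    private static final ThreadLocal<Calendar> CALENDAR =
        ThreadLocal.withInitial(Calendar::getInstance);

    private long executedAtMillis;

    public TradeSnapshot() { }  // Externalizable needs a public no-arg constructor

    TradeSnapshot(long executedAtMillis) { this.executedAtMillis = executedAtMillis; }

    long executedAtMillis() { return executedAtMillis; }

    int executedHourOfDay() {
        Calendar calendar = CALENDAR.get();  // reused, no getInstance() per call
        calendar.setTimeInMillis(executedAtMillis);
        return calendar.get(Calendar.HOUR_OF_DAY);
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeLong(executedAtMillis);  // no Calendar needed to write
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        executedAtMillis = in.readLong();  // nor to read
    }

    // Copies a snapshot by serializing it and reading it back.
    static TradeSnapshot roundTrip(TradeSnapshot original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TradeSnapshot) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        TradeSnapshot copy = roundTrip(new TradeSnapshot(System.currentTimeMillis()));
        System.out.println("Hour of day: " + copy.executedHourOfDay());
    }
}
```

Whether this fits depends on why the Calendar was needed in the first place; if the time zone juggling really has to happen during read and write, the per-thread Calendar at least keeps it off the shared monitor.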

So, we're faced with a few possible solutions. One that the team mentioned is to use Joda-Time, which is (in one form or another) going to make its way into the standard Java API.

Of course, I'd like to see a more architectural solution - grumble, grumble.