Monday, March 10, 2014

Locks and 'Implausible' Behaviour


In a recent bug, we found some dodgy code that, although it didn't hold the lock it needed, should have worked most of the time. That is, it appeared that another thread would have had to have seized the lock and relinquished it thousands of times while the first thread just executing a simple line. This at first seemed implausible to us.

So, we did an experiment. The code below creates 2 threads that contend for a lock. Once they have the lock, they increment a counter and make a kernel call. A third thread gets the data every second and spits it out.

This managing thread executes code like this:

    private final Lock lock = new ReentrantLock();

    private void doTest() {
        Locker locker1 = new Locker(lock);
        Thread thread1 = new Thread(locker1);

        Locker locker2 = new Locker(lock);
        Thread thread2 = new Thread(locker2);

        locker1.setOther(locker2);
        locker2.setOther(locker1);

        thread1.start();
        thread2.start();

        while (true) {
            pauseForOneSecond();
            System.out.println(locker1);
            System.out.println(locker2);
            lock.lock();
            locker1.resetMax();
            locker2.resetMax();
            lock.unlock();
        }
    }

While the contending threads execute code like this:

class Locker implements Runnable {

    private final Lock lock;

    private long       count;

    private Locker     other;

    private long       lastOther;
   
    private long       maxDiff;

    private long       diff;

    private long       doNotOptimizeAway = currentTimeMillis();

    @Override
    public void run() {
        while (true) {
            increaseCount();
        }
    }

    void increaseCount() {
        lock.lock();
        count++;

        if (other.count == lastOther) {
            diff++;
        } else {
            diff = 0; // start again
        }

        maxDiff            = max(diff, maxDiff);
        // swap counts
        lastOther          = other.count;
        other.lastOther    = count;
        // reset the other because I won this time
        other.diff         = 0;
        doNotOptimizeAway ^= currentTimeMillis(); // kernel call
        lock.unlock();
    }
.
.

This code was run on a 16-core (using hyperthreading) E5620 (2.4GHz) Linux box running a 2.6.15.60-87.1-smp kernel.

Efficiency

The total of the 2 monotonically incrementing counters for Java 1.6 over five runs was:

74 609 025
73 729 561
73 284 303
77 254 600
78 838 417

whereas Java 1.7 yields

92 043 050
93 529 925
92 779 311
92 587 899
90 993 539

which indicates that Java 1.7 handles locks with greater efficiency as more work was done.

Fairness

The difference between the two threads' counters in Java 1.6 is:

2254
  80.28
 932.1
1898
 243.7

(in units of 10 000).

For Java 1.7 it was:

4459
 412.5
2486
4284
2286

which suggests Java 1.7 is less fair when arbitrating lock contention.

Bias

One of the things I tried to measure is how many times a lock can be grabbed in a row before the other contending thread seizes it back. The output looks like:

.
.
maxDiff = 6097
maxDiff = 6617
maxDiff = 5366
maxDiff = 7593
maxDiff = 10630
maxDiff = 5318
maxDiff = 7593
maxDiff = 11025
maxDiff = 31317
maxDiff = 10273
.
.

which suggests a lock can be held for thousands of consecutive calls before the other thread had a look-in.

Conclusion

Implausible as it sounds, race conditions that may look staggeringly unlikely in your code can be quite common as the JVM 'batches' lock operations to make the overall performance much more efficient.

No comments:

Post a Comment