Sunday, April 24, 2011

To Infinispan and Beyond!

I've been playing for the last few weeks with JBoss's Infinispan - a distributed cache and the successor to JBossCache.

It's a nice piece of technology. But for our needs, it may be inappropriate.

It may be OK for storing users' HTTP sessions and the like. But we were thinking about storing data for which integrity is absolutely essential (bond orders that typically run into the billions of dollars). Let me run through a few of the arguments that made me think Infinispan is not for us.

The Infinispan FAQ says it uses MVCC to optimize performance by having write locks not block read locks. Dig deeper, however, and you see this is not MVCC as most people familiar with Oracle and Postgres would know it. In those databases, a reading thread works with a snapshot of the data. If a writing thread updates that row, that's fine: the reading thread simply continues with its snapshot. This is great since nobody blocks anybody, and it can improve performance.
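
To make the distinction concrete, here is a toy sketch of snapshot-style reads (my own illustration, not how Oracle or Postgres actually implement MVCC). The writer never mutates a published version; it copies, modifies and swaps, so a reader's snapshot can never change underneath it:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// A toy illustration of snapshot reads, nothing to do with either product's internals.
public class SnapshotSketch {

    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<Map<String, String>>(Collections.<String, String>emptyMap());

    // A reader takes a snapshot once and can work with it for as long as it likes.
    public Map<String, String> snapshot() {
        return current.get();
    }

    // A writer never touches a published version; it copies, modifies and swaps.
    public void write(String key, String value) {
        Map<String, String> prev, next;
        do {
            prev = current.get();
            next = new HashMap<String, String>(prev);
            next.put(key, value);
        } while (!current.compareAndSet(prev, Collections.unmodifiableMap(next)));
    }
}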

However, with Infinispan, this is not quite what is happening. It's true that write threads do not block read threads. But the data in question may be mutable.

In the Infinispan mailing lists, we read:

"[I]n this case you may well see the change before [a write transaction] commits, we don't explicitly clone or copy mutable objects ... I suppose though we could add the ability to defensively copy mutable objects, but we'd need a way of knowing which are immutable, etc. Also, this would be more expensive, depending on the size of the atomic map."

A quick-and-dirty test I wrote demonstrates this:


package org.infinispan.replication;

import static org.testng.AssertJUnit.assertNull;

import java.util.concurrent.CountDownLatch;

import javax.transaction.HeuristicMixedException;
import javax.transaction.HeuristicRollbackException;
import javax.transaction.NotSupportedException;
import javax.transaction.RollbackException;
import javax.transaction.SystemException;
import javax.transaction.TransactionManager;

import org.infinispan.Cache;
import org.infinispan.config.Configuration;
import org.infinispan.test.AbstractCacheTest.CleanupPhase;
import org.infinispan.test.MultipleCacheManagersTest;
import org.infinispan.test.TestingUtil;
import org.testng.AssertJUnit;
import org.testng.annotations.Test;

@Test(groups = "functional", testName = "replication.SyncReplLockingAtomicityPHTest")
public class SyncReplLockingAtomicityPHTest extends MultipleCacheManagersTest {

    private static final String CACHE_NAME = "testcache";

    private static final String k = "key", v = "value";

    public SyncReplLockingAtomicityPHTest() {
        cleanup = CleanupPhase.AFTER_METHOD;
    }

    protected Configuration.CacheMode getCacheMode() {
        return Configuration.CacheMode.REPL_SYNC;
    }

    protected void createCacheManagers() throws Throwable {
        Configuration cfg = getDefaultClusteredConfig(getCacheMode(), true);
        cfg.setLockAcquisitionTimeout(500);
        createClusteredCaches(2, CACHE_NAME, cfg);
    }

    public void testUpdateWhileReadLock() throws Exception {
        final Cache cache = cache(0, CACHE_NAME);
        final CountDownLatch latchAfterRead = new CountDownLatch(1);
        final CountDownLatch latchBeforeCommit = new CountDownLatch(1);
        final ReaderThread readerRunnable = new ReaderThread(k, cache, latchAfterRead, latchBeforeCommit);
        final UpdateThread updateRunnable = new UpdateThread(k, cache);
        updateWhileReadTX(latchAfterRead, latchBeforeCommit, readerRunnable, updateRunnable);
    }

    private void updateWhileReadTX(
            CountDownLatch latchAfterRead,
            CountDownLatch latchBeforeReadCommit,
            ReaderThread readerRunnable,
            UpdateThread updateRunnable) throws SecurityException, IllegalStateException,
            RollbackException, HeuristicMixedException, HeuristicRollbackException,
            SystemException, NotSupportedException, InterruptedException {
        Cache cache1 = cache(0, CACHE_NAME);
        Cache cache2 = cache(1, CACHE_NAME);
        assertClusterSize("Should only be 2 caches in the cluster!!!", 2);
        assertNull("Should be null", cache1.get(k));
        assertNull("Should be null", cache2.get(k));

        populateCache(cache1);

        // Start a transaction that reads the value, then pauses before committing.
        final Thread readThread = new Thread(readerRunnable);
        readThread.start();
        latchAfterRead.await();
        final StringBuffer readObject = readerRunnable.getFromCache();

        // While the read transaction is still open, run an update to completion.
        final Thread updateThread = new Thread(updateRunnable);
        updateThread.start();
        updateThread.join();
        final StringBuffer writeObject = updateRunnable.getValue();
        final String writeSnapshot = writeObject.toString();
        final String readSnapshot = readObject.toString();

        // Now let the read transaction commit.
        latchBeforeReadCommit.countDown();

        // Reader and writer were handed the very same object reference...
        AssertJUnit.assertSame(readObject, writeObject);
        // ...so this fails: the reader's "snapshot" already contains the writer's change.
        AssertJUnit.assertFalse(readSnapshot.equals(writeSnapshot));
    }

    private StringBuffer populateCache(Cache cache) throws SecurityException, IllegalStateException,
            RollbackException, HeuristicMixedException, HeuristicRollbackException,
            SystemException, NotSupportedException {
        StringBuffer mutableObject = new StringBuffer("test");

        TransactionManager mgr = TestingUtil.getTransactionManager(cache);
        mgr.begin();
        cache.getAdvancedCache().lock(k);
        cache.put(k, mutableObject);
        mgr.commit();
        return mutableObject;
    }

    private class UpdateThread implements Runnable {

        protected final String key;
        protected final Cache cache;
        protected final TransactionManager mgr;
        private StringBuffer fromCache;

        public UpdateThread(String key, Cache cache) {
            this.key = key;
            this.cache = cache;
            mgr = TestingUtil.getTransactionManager(cache);
        }

        @Override
        public void run() {
            try {
                mgr.begin();
                getValueFromCache();
                // Mutate the cached object in place before putting it back.
                fromCache.append(System.currentTimeMillis());
                cache.put(key, fromCache);
            } catch (Exception x) {
                x.printStackTrace();
            } finally {
                finishTx();
            }
        }

        private void finishTx() {
            try {
                mgr.commit();
            } catch (Exception x) {
                x.printStackTrace();
            }
        }

        private void getValueFromCache() {
            fromCache = (StringBuffer) cache.get(key);
        }

        public StringBuffer getValue() {
            return fromCache;
        }
    }

    private class ReaderThread implements Runnable {

        private final String key;
        private final Cache cache;
        private final CountDownLatch latchAfterRead, latchBeforeCommit;

        private StringBuffer fromCache;

        public ReaderThread(String key, Cache cache,
                CountDownLatch latchAfterRead, CountDownLatch latchBeforeCommit) {
            this.key = key;
            this.cache = cache;
            this.latchAfterRead = latchAfterRead;
            this.latchBeforeCommit = latchBeforeCommit;
        }

        public StringBuffer getFromCache() {
            return fromCache;
        }

        @Override
        public void run() {
            TransactionManager mgr = TestingUtil.getTransactionManager(cache);
            try {
                mgr.begin();
                fromCache = (StringBuffer) cache.get(key);
                // Tell the main thread the read has happened...
                latchAfterRead.countDown();
            } catch (Exception x) {
                x.printStackTrace();
            } finally {
                try {
                    // ...and wait for its signal before committing.
                    latchBeforeCommit.await();
                    mgr.commit();
                } catch (Exception x) {
                    x.printStackTrace();
                }
            }
        }
    }
}



The final assertion, assertFalse(readSnapshot.equals(writeSnapshot)), is where the test fails. The snapshot the read-thread holds is the same as the write-thread's, even though it read the object from the cache first. This is simply because both read- and write-threads hold the same object reference.

Note that if the write transaction were rolled back, the read-thread would still see the change the write-thread had made. In fact, from this point on the whole cache would hold incorrect data.

This is not dissimilar to Hibernate updating an object (eg, assigning a primary key or a timestamp) even when the DB transaction it was in was rolled back. I've seen this happen when there were DB deadlocks and my transaction was chosen as the victim.
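
Until Infinispan adds the defensive copying mooted on the mailing list, the only workaround I can see is to do it yourself. A minimal sketch (the class and method names here are mine, not Infinispan's): store an immutable snapshot on the way in and hand each reader a private copy on the way out.

import org.infinispan.Cache;

// A user-side workaround (my sketch, not an Infinispan feature):
// never share a mutable instance with the cache.
public class DefensiveCopies {

    // Store the immutable String form, so later appends to 'value' cannot leak in.
    public static void putSnapshot(Cache<String, String> cache, String key, StringBuffer value) {
        cache.put(key, value.toString());
    }

    // Every caller gets a fresh StringBuffer; mutating it cannot affect the cache.
    public static StringBuffer getWorkingCopy(Cache<String, String> cache, String key) {
        String stored = cache.get(key);
        return stored == null ? null : new StringBuffer(stored);
    }
}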

There are many other reasons why I think business critical data should not be put into Infinispan. I hope to present them in upcoming blog entries.

Useful SQL

Imagine an audit table of deal histories where every change to the deal means one more row in the table for a given deal ID.

How do you find the deals with the first audit time within a given time window?

[I'm posting this on my blog because I come across this problem quite often but never seem to remember the solution. By putting it here, I can copy-and-paste it in the future :) ]

So, imagine the audit table for deal_history looks a little like this:

+---------+---------------------+------------------+----
| name    | last_update_time    | last_update_user | ...
+---------+---------------------+------------------+----

Such that the data looks like:

+---------+---------------------+------------------+----
| deal #1 | 03/14/2011 14:00:00 | Tom              | ...
| deal #1 | 03/14/2011 15:33:33 | Dick             | ...
| deal #2 | 03/14/2011 16:22:22 | Tom              | ...
| ...     | ...                 | ...              | ...
+---------+---------------------+------------------+----

The trick is to select over the same table twice thus:

select d1.last_update_time, d1.name, d1.last_update_user
from deal_history d1, deal_history d2
where d1.last_update_time >= '03/14/2011 14:00:00'
and d1.last_update_time <= '03/21/2011 15:00:00'
and d1.deal_id = d2.deal_id
group by d1.last_update_time, d1.name, d1.last_update_user
having d1.last_update_time = min(d2.last_update_time)


The join condition makes sure the unique identifier for a given deal (deal_id, one of the columns elided above) matches in both copies of the table. Note that in our example this is not a PK, since there are many rows per deal ID.

Also note that the group by clause lists the fields in the same order as the select clause, and that the having clause must follow the group by.
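
Incidentally, if your database supports analytic (window) functions, as Oracle and Postgres do, you can ask the same question without the explicit self-join. A sketch, assuming the same deal_history table as above:

select last_update_time, name, last_update_user
from (
    select d.*,
           min(last_update_time) over (partition by deal_id) as first_audit_time
    from deal_history d
) x
where first_audit_time = last_update_time
and last_update_time >= '03/14/2011 14:00:00'
and last_update_time <= '03/21/2011 15:00:00'

The inner select tags every row with its deal's earliest audit time; the outer select keeps only the rows that are that earliest audit and fall within the window.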

Three reasons to avoid Subversion

I've been using CVS for most of the last 15 years. I came to the Subversion party late. Despite all my friends saying how wonderful it was, I didn't use it professionally until 2009.

Initially, I was reasonably happy with it, but over time I have come to see the wisdom of Linus Torvalds' statement that Subversion is "the most pointless project ever started."

Reason #1
The integration with Eclipse is patchy. No matter how great the tool, if the integration with your IDE of choice is buggy then avoid it unless absolutely necessary.

Eclipse sometimes says I have checked everything in when I haven't. It sometimes says I have checked everything out when I haven't. It can get confused during merges, especially if you have been moving files around. And this has been the case over a number of versions of Eclipse I have used in the last 2 years (most recently Helios), using both the Subversive and Subclipse plugins. A few members of my team and I have been forced to avoid Eclipse altogether and use TortoiseSVN instead.

When Subversion gets its knickers in a twist, it's harder to manually edit the metadata files in the .svn directory since they appear to be binaries. With CVS, they were plain text files. Of course, it's always nasty to do this. But sometimes, you just gotta.

Reason #2
Subversion can have old JARs stored in its .svn metadata directories, as pristine "text-base" copies. As an example, I picked a random project from my hard drive (Gilead) and looked in the .svn directory.

phillip:1.3.1-SNAPSHOT henryp$ pwd
/Users/henryp/Documents/Code/gilead/gilead/maven-snapshots-repo/net/sf/gilead/comet4gwt/1.3.1-SNAPSHOT
phillip:1.3.1-SNAPSHOT henryp$ jar -tf .svn/text-base/comet4gwt-1.3.1-20100205.214253-2.jar.svn-base | more
META-INF/
META-INF/MANIFEST.MF
net/
net/sf/
net/sf/gilead/
net/sf/gilead/annotations/
net/sf/gilead/comet/
net/sf/gilead/exception/
net/sf/gilead/annotations/Comet.class
.
.
.


So, a living, breathing JAR in the metadata directory!

You might ask: "So what?" Well, this caused a lot of confusion on a project that loaded all its JARs into the classpath using a shell script with a wild card. (Yes, I know that is bad practice in these days of OSGi et al, but bad practices are unfortunately part of life and probably will continue to be until we are a regulated profession...) For a few hours, we were wondering why classes from a library that had long since been removed were showing up in our code.

Reason #3
When a friend breathlessly told me about Subversion, he said you could move files around and Subversion would understand. Poor CVS simply saw it as one file being deleted and another being created.

Well, this is not the entire story. If you move a file, the change history can move with it and that's about it.

“A lesser-known fact about Subversion is that it lacks “true renames”—the svn move command is nothing more than an aggregation of svn copy and svn delete.”
Version Control with Subversion (For Subversion 1.5, r3305) p106
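
You can see this for yourself. The following is the kind of output a 1.x command-line client prints (abridged; exact spacing varies):

$ svn move Foo.java Bar.java
A         Bar.java
D         Foo.java
$ svn status
A  +    Bar.java
D       Foo.java

The + against Bar.java means "added with history", ie, a copy of Foo.java; nowhere does Subversion record a rename as such.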

This may be tolerable for just one or two classes. But, let's say you need to do a major refactoring. I'm sure you've been here before:

"Files and directories are shuffled around and renamed, often causing great disruption to everyone on the project. Sound like the perfect case for a branch, doesn't it? Just create a branch, shuffle things around, then merge the branch back into trunk, right?

"Alas, this scenario doesn't work so well right now and is considered one of Subversion's current weak spots."
Version Control with Subversion

“The moral of this story is that until Subversion improves, be very careful about merging copies and renames from one branch to another.”
Version Control with Subversion (For Subversion 1.5, r3305), p107

I found during a recent mass-refactor that if I moved a file and then moved it again (having not checked in until I was satisfied), Subversion would simply panic.

Conclusion
Subversion will not fulfill the dream of having many branches where you can refactor to your heart's content and merge later. If you want that, try another tool (I'm starting to get interested in Git).

The bottom line is that Subversion is only really useful if you have one branch in production and one development branch. This is OK for many projects. But if, like me, you are working on a project with teams from New York to London to Bangalore working on different product lines, I'd strongly recommend that you avoid it.