Saturday, May 4, 2013

Double Trouble

I'm writing a maths library in my spare time for fun. I've only got around to matrices but already I am seeing interesting behaviour.

Primarily, my library needs to be fast so I use primitives. My first matrix implementation uses doubles and I'm testing performance by calculating cross products.

I'm using the Caliper microbenchmarking framework to performance test my code with methods that look a little like:

    public int timeMutable2DDoubleMatrixCross(int reps) {

        Mutable2DDoubleMatrix accumulator = null;
        for (int i = 0 ; i < reps ; i++) {
            accumulator = mutable2DDoubleMatrix.cross(mutable2DDoubleMatrix);
        }
        return accumulator == null ? 0 : accumulator.getWidth();
    }

On my machine*, I get a result of  1.299ms (standard deviation of 0.011ms) to run an iteration of my code.

But you know when I wrote other implementations for other primitive types, there was an awful lot of copy, paste and make a minor change going on. Bearing in mind the usual autoboxing caveats I've mentioned before, what if I just changed the interface to use Double but the implementation and the call site were using primitive doubles? Maybe then I could make the code more generic with no loss of performance.

No joy - we still get the overhead of autoboxing. Average call time is now 4.041ms (standard deviation of 0.040ms). The JVM is not going to be fooled.

Ah, well. I was half expecting that and it's still better than using BigDecimal (that was taking 68.354ms per call.)

But what was really interesting is that double was faster than long (typically 1.403ms per call, 0.013ms std dev). That blew out of the water my plan to use inflated longs and then right rotate them to give approximate answers, echoing Kronecker's assertion that "God made natural numbers; all else is the work of man".


This might be because floating point calculations are passed to the Floating Point Unit. "In most computers, floating point arithmetic is usually much slower than integer arithmetic, though on the Intel Pentium it is usually faster because the integer unit was not given the same care as the floating point unit." [1]

As ever, your mileage may vary (YMMV).

* A 1.8 GHz Intel Core i7 running Mac OS X 10.7.5 and using java version 1.6.0_41.

[1] http://mindprod.com/jgloss/floatingpoint.html

Additional reading:
 - interesting read on the history of floating points (http://www.intel.com/standards/floatingpoint.pdf).
 - stats for the i7 processor (http://elrond.informatik.tu-freiberg.de/papers/WorldComp2012/PDP2833.pdf)


No comments:

Post a Comment