Java Performance in 64bit land

If you were buying a new car and your primary goal was performance, or more specifically raw power – given the choice between a 4 cylinder and an 8 cylinder engine, the choice is obvious. Bigger is better. Generally when we look at computers the same applies, or at least that is how the products are marketed. Thus a 64bit system should out-perform a 32bit system, in the same way that a quad core system should be faster than a dual core.

Of course, what a lot of the world is only starting to understand is that more isn’t always better when it comes to computers. When dealing with multiple CPUs, you’ve got to find something useful for those extra processing units to do. Sometimes your workload is fundamentally single-threaded and you have to let all those other cores sit idle.

The 32bit vs. 64bit distinction is a bit more subtle. The x86-64 architecture not only widens the registers of the x86 architecture, but adds more of them. Generally this translates to better performance in benchmarks (having more registers allows the compiler to create better machine code). Unfortunately, until recently, moving from a 32bit Java to a 64bit Java meant taking a performance hit.

When we go looking at Java performance, there are really two areas of the runtime that matter: the JIT and the GC. The job of the JIT is to make the running code execute as fast as possible. The GC is designed to take as little time away from the execution of that code as possible (while still managing memory). Thus Java performance is all about making the JIT generate better code (more registers help), and reducing the time the GC needs to manage memory (bigger pointers make this harder).

J9 was originally designed for 32bit systems and this influenced some of the early decisions we made in the code base. Years earlier I had spent some time with a PowerPC system that ran in 64bit mode trying to get our Smalltalk VM running on it, and had reached the conclusion that the most straightforward solution was simply to make all of the data structures (objects) twice as big to handle the 64bit pointers. With J9 development (circa 2001), one of the first 64bit systems we got our hands on was a DEC Alpha, so we applied the same straightforward ‘fattening’ solution, allowing a common code base to support both 32bits and 64bits.

A 64bit CPU has a wide data bus, but recall that this same CPU can run 32bit code as well, and it still has that big wide data bus to move things around with. So our 64bit solution of letting the data be twice as big actually puts us at a disadvantage relative to 32bit code on the same hardware: we move twice the bytes to do the same work. This isn’t a problem unique to J9, or even to Java – all 64bit programs need to address this data expansion. It turns out that the dynamics of the Java language just make this a more acute problem, as Java programs tend to be all about creating and manipulating objects (aka data structures).
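To make the expansion concrete, here is a rough back-of-the-envelope sketch of how an object with a few reference fields grows when every pointer doubles. The header and pointer sizes used here are illustrative assumptions, not the actual J9 object layout:

```java
public class FootprintSketch {
    // Estimate the heap footprint of an object made up of a header plus
    // reference fields, rounded up to 8-byte alignment.
    // These sizes are illustrative only; real layouts vary by JVM.
    static long objectBytes(int refFields, int headerBytes, int refBytes) {
        long raw = headerBytes + (long) refFields * refBytes;
        return (raw + 7) / 8 * 8; // round up to the next 8-byte boundary
    }

    public static void main(String[] args) {
        // An object with 4 reference fields:
        System.out.println(objectBytes(4, 8, 4));  // 32bit: 8 + 4*4 = 24 bytes
        System.out.println(objectBytes(4, 16, 8)); // 64bit "fattened": 16 + 4*8 = 48 bytes
    }
}
```

With these assumed sizes the fattened object is exactly twice as big, which is the extra traffic the data bus ends up carrying.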

The solution to this performance issue is to be smarter about the data structures, and that is exactly what we did in the IBM Java6 JDK with the compressed references feature. We can play tricks (and not get caught) because the user (the Java programmer) never sees the internal representation of Java objects.

The trade-off is that by storing less information in the object, we limit the total amount of memory that can be used by the JVM. This is currently an acceptable solution, as computer memory sizes are nowhere near the full 64bit address range. We use only 32 bits to store pointers, and take advantage of 8-byte aligned objects to get a few free bits [ decoded pointer = stored value << 3 ]. Thus the IBM Java6 JDK using compressed references (-Xcompressedrefs) can address up to 32GB of heap (2^32 slots × 8-byte alignment).
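A minimal sketch of the alignment trick, assuming 8-byte-aligned heap addresses (the method names and layout here are illustrative, not J9's actual implementation): because the low three bits of every object address are always zero, shifting them away lets a 32bit slot cover 2^35 bytes, i.e. 32GB:

```java
public class CompressedRefs {
    // Store an 8-byte-aligned address in 32 bits by dropping the
    // three always-zero low bits. Illustrative sketch only.
    static int compress(long address) {
        return (int) (address >>> 3); // address / 8 fits in 32 bits
    }

    // Recover the full address: treat the 32 bits as unsigned, shift back.
    static long decompress(int ref) {
        return (ref & 0xFFFFFFFFL) << 3;
    }

    public static void main(String[] args) {
        long addr = 0x7FFFFFFF8L; // highest encodable address: 32GB - 8
        int c = compress(addr);
        System.out.println(decompress(c) == addr); // round-trips exactly
    }
}
```

The same arithmetic shows the limit the post mentions: once the heap grows past 32GB, an aligned address no longer fits in 32 bits even with the free shift, and the JVM has to fall back to full-width pointers.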

We’re not the only ones doing this trick: Oracle/BEA have the -XXcompressedRefs option and Sun has the -XX:+UseCompressedOops option. Of course, each vendor’s implementation is slightly different, with different limitations and levels of support. Primarily you see these flags used in benchmarking, but as some of our customers start running into heap size limitations on 32bit operating systems, they are looking to move to 64bit systems (but would like to avoid giving up any performance).

There is a post on the websphere community blog that talks about the IBM JDK compressed references and has some pretty graphs showing the benefits. And Billy Newport gives a nice summary of why this feature is exciting.


I’ve decided to start including work related items in my blog here.  Please view the About page to see the standard disclaimer.  A fair number of the things I do at work can’t be discussed in public until they arrive somewhere in product form, by then they usually feel like old news to me and often information has been leaked via other channels.  Thus, some of the work related posts might seem a little boring to those “in the know” but I hope to help people put 3+4 together by linking to various bits of information, or simply provide a “straight from the developers” viewpoint.  I guess we’ll see how it goes, requests and feedback are welcome.

Recently Rick DeNatale posted a nice personal history of Smalltalk; I’m quite proud to have been part of building several of the products he mentions.  The OTI VM team started out building Smalltalk, but moved on to Java by first doing VisualAge for Java, followed by J9 for embedded.  Currently we still actively develop J9, which is primarily used as the core of the IBM JDK, but we also still do embedded work as well as a Real-Time Java offering.

A Tale of Two iPods

So a while back I posted about how my video iPod had stopped working.  My investigations pretty much pointed at the logic board being bad.  I do have a classic 512MB shuffle that I’ve been rocking, but I miss having a display to figure out which album (or even sometimes artist) I’m listening to – also I had been using the video feature to watch transcoded TV shows from my MythTV PVR.

This had me surfing the Apple refurb store and considering the $89 nano there, since a replacement logic board will run somewhere around $90 (and I don’t get a warranty).  The local used market was also very tempting, as video iPods seem to run around $100 – $120.

The site seems to generally have the best local prices, but I really wanted to find one for less than $100.  As luck would have it, someone posted a black 30G video iPod for $60 the other day – I leapt on the opportunity.  I figured at this price, it was going to be a little banged up, but as long as it worked I could do a transplant.

My busted iPod is on the left, the used (but working) iPod on the right.  Note the scratch on the screen area and the tape on the lower right side.  Generally the surface has been badly scratched up; this iPod has had a rough life.

Even the metal back shows serious signs of wear.  There is also gunk inside of the dock connector making the cable connection a bit tricky (I’m pretty sure if I clean out the gunk – it will be fine).

Even the side of the case is starting to bust apart (thus the tape).  I’m sort of amazed that this unit still works – it sort of restores my faith in the quality of the iPod devices.  While $60 might have sounded like a too-good-to-be-true deal, based on its condition I paid a fair price for it.

Well, next step is to strip both units down and do a little transplant surgery…