Reading assembly from the JVM
30 Aug 2015
The second important step on the way to understanding the Hotspot JIT (after building a debug JVM) is to look at the generated assembly - the code that Java actually runs on your behalf. Unfortunately, few developers do this or even know that it is possible. It doesn’t have to be this way - while this requires a little work, it’s very doable. Here are a few different ways to do it.
A Simple example
I’ve written up a quick program which contains a function I want to look at - getLen, which simply returns the length of a string. This wrapper calls it over and over on random input, saving the output to ensure Hotspot doesn’t try to get too clever.
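The original listing is not reproduced here, so the following is only a minimal sketch of what such a program might look like. The class name Test and the method getLen come from the text; the randomString and run helpers, the iteration count, and the seed are illustrative assumptions.

```java
import java.util.Random;

// Hypothetical stand-in for the program described above, not the original listing.
class Test {
    // The method whose generated assembly we want to inspect.
    static int getLen(String s) {
        return s.length();
    }

    // Illustrative helper: builds a random lowercase string of the given length.
    static String randomString(Random rnd, int len) {
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            sb.append((char) ('a' + rnd.nextInt(26)));
        }
        return sb.toString();
    }

    // Calls getLen repeatedly on random input and accumulates the results,
    // so the calls cannot be discarded as dead code.
    static long run(int iterations, long seed) {
        Random rnd = new Random(seed);
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += getLen(randomString(rnd, rnd.nextInt(64)));
        }
        return total;
    }

    public static void main(String[] args) {
        // Many iterations make it likely that the JIT compiles the hot methods.
        long total = run(1_000_000, 42L);
        // Printing the accumulated value keeps the result observable.
        System.out.println(total);
    }
}
```

Accumulating and printing the result is one simple way to save the output; the original program may have done this differently.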
If you have a debug JDK built and running, PrintOptoAssembly is the easiest option, but its output is also very unreadable. Simply run your JVM with -XX:+PrintOptoAssembly and you’re done! (I recommend piping the output to a file for easier parsing).
It’s worth mentioning that -XX:+PrintOptoAssembly only prints the output of the c2 (aka ‘server’) compiler, and not every method gets compiled by it.
Since the output is so messy, I’m going to skip looking at it.
PrintAssembly generates much more readable output, shows you code from both the c1 and c2 compilers, and is usable from a regular JVM. Great! Unfortunately, you need to compile a library that does the actual disassembling, hsdis, which does require a copy of the JDK source code. This isn’t too bad though - this wiki page of the great JitWatch project has links for how to do this on Windows, Mac, or Linux. I followed the linked Linux steps without issue. Once you’ve built your library (on x86_64 this is hsdis-amd64.so), copy it to the /lib/<platform>/server directory of your Java install.
1. I’ve manually turned off UseCompressedOops here to simplify the output
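The invocation itself is not reproduced here. As a rough sketch, a child JVM can be launched with the relevant flags and its output sent to a file. Everything below is an assumption for illustration: the launcher class name, the output.txt file name, and that the java on your PATH is the install where hsdis was copied. On a regular (non-debug) JVM, PrintAssembly is a diagnostic option, so -XX:+UnlockDiagnosticVMOptions has to come before it.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical launcher; roughly equivalent to running, from a shell:
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:-UseCompressedOops Test > output.txt
class PrintAssemblyLauncher {
    // Builds the command line for a child JVM that prints disassembly.
    static List<String> command(String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");                            // assumes this JVM has hsdis installed
        cmd.add("-XX:+UnlockDiagnosticVMOptions");  // must precede PrintAssembly
        cmd.add("-XX:+PrintAssembly");
        cmd.add("-XX:-UseCompressedOops");          // turned off to simplify the output
        cmd.add(mainClass);                         // class must be on the child's classpath
        return cmd;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command("Test"))
                .redirectErrorStream(true)               // merge stderr into stdout
                .redirectOutput(new File("output.txt"))  // assumed output file name
                .start();
        System.exit(p.waitFor());
    }
}
```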
Our output file will actually contain 2 versions of getLen - the first generated by the c1 compiler, the second by c2. Let’s take a look at the 2nd, more optimized, version:
(The code below the return is exception handling, and will be covered at a later date).
Even without much assembly experience, this isn’t too bad to understand - here’s what’s happening:
- Our function is passed its
- It finds the pointer to the string’s value character array which it stores in
- It then gets the length field from the array (all arrays in Java have a length field) and puts that in %eax to return to the caller
- The test instruction is doing what is known as a safepoint poll, and can be ignored for now.
Notice that we don’t see any actual calls to String.length - that’s because Hotspot has decided to inline that function for us.
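To make the inlining concrete, here is a rough source-level picture of what the compiled code amounts to. SimpleString is a hypothetical, simplified model of a string backed by a char[] field named value (the layout the walkthrough describes); it is not the real java.lang.String, and getLenInlined is only an illustration of the effect, not what the JIT literally emits.

```java
// Hypothetical simplified model of a string that wraps a char[] called 'value'.
final class SimpleString {
    final char[] value;

    SimpleString(String s) {
        this.value = s.toCharArray();
    }

    int length() {
        return value.length;
    }
}

class InlineSketch {
    // What the source says: a method call to length().
    static int getLen(SimpleString s) {
        return s.length();
    }

    // What the optimized code effectively does once length() is inlined:
    // two memory loads and no call.
    static int getLenInlined(SimpleString s) {
        char[] value = s.value;  // load the pointer to the character array
        return value.length;     // load the array's length field and return it
    }
}
```

Both methods return the same result; the second simply mirrors the two loads described in the list above.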
As you probably noticed in the last example, PrintAssembly prints output for every single method it compiles, including internal JVM functions. A lot of the time we only want to see a single method or class. That’s where CompileCommand comes in handy - you can specify that output should only be printed for certain classes/methods. The full documentation is available here, but for this example, where we only care about the getLen function in class Test, we can pass -XX:CompileCommand=print,Test.getLen.
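As a small sketch of how the option string is put together, the helper below builds a print command from a class name and a method name, converting a dotted package name to the slash-separated form. The class CompileCommandSketch and the method printOption are hypothetical names used only for illustration.

```java
// Hypothetical helper that assembles a -XX:CompileCommand=print,... option.
class CompileCommandSketch {
    // className may be fully qualified with dots (e.g. "com.foo.Test") or "*";
    // methodName may be a method name or "*".
    static String printOption(String className, String methodName) {
        String pattern = className.replace('.', '/') + "." + methodName;
        return "-XX:CompileCommand=print," + pattern;
    }

    public static void main(String[] args) {
        System.out.println(printOption("Test", "getLen"));         // single method
        System.out.println(printOption("com.foo.Test", "getLen")); // class in a package
        System.out.println(printOption("Test", "*"));              // all methods in Test
        System.out.println(printOption("*", "getLen"));            // getLen in any class
    }
}
```

When passing a pattern containing * on a shell command line, quote the option so the shell does not try to expand the wildcard itself.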
If our class Test was in package com.foo, we would pass print,com/foo/Test.getLen instead. You can also pass * as a wildcard for the method or the class, so print,Test.* would print for all methods in class Test, and print,*.getLen would print any getLen method no matter which class implemented it. Otherwise, this produces very similar output to
PrintAssembly (and similarly requires the