Reading assembly from the JVM
30 Aug 2015
The second important step on the way to understanding the Hotspot JIT (after building a debug JVM) is to look at the generated assembly - the code that Java actually runs on your behalf. Unfortunately, few developers do this or even know that it is possible. It doesn’t have to be this way - while it requires a little work, it’s very doable. Here are a few different ways to do it.
A Simple example
I’ve written up a quick program which contains a function I want to look at - getLen, which simply returns the length of a string. The wrapper calls it over and over on random input, saving the output to ensure Hotspot doesn’t try to get too clever.
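The listing itself is not reproduced in this text, so here is a minimal sketch of what such a program might look like. The class name Test and the method getLen come from the post; the loop count, the way random input is generated, and the running total are assumptions made for illustration.

```java
import java.util.Random;

public class Test {
    // The method whose generated assembly we want to inspect.
    public static int getLen(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        Random random = new Random();
        long total = 0;
        // Call getLen many times so the JIT considers it hot enough to compile.
        // The iteration count is an arbitrary choice for this sketch.
        for (int i = 0; i < 1_000_000; i++) {
            String input = Integer.toString(random.nextInt());
            total += getLen(input);
        }
        // Print the accumulated result so the calls are not dead code.
        System.out.println(total);
    }
}
```

Feeding the method unpredictable input and consuming its result gives the compiler less room to fold the whole loop away.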
-XX:+PrintOptoAssembly
If you have a debug JDK built and running, this is the easiest option, but its output is also very unreadable. Simply run your JVM with -XX:+PrintOptoAssembly and you’re done! (I recommend piping the output for easier parsing).
It’s worth mentioning that -XX:+PrintOptoAssembly only prints the output of the c2 (aka ‘server’) compiler, and not every method will get compiled by it.
Since the output is so messy, I’m going to skip looking at it.
-XX:+PrintAssembly
PrintAssembly generates much more readable output, shows you code from both the c1 and c2 compilers, and is usable from a regular JVM. Great! Unfortunately, you need to compile a library that does the actual disassembling, hsdis, which does require a copy of the JDK source code. This isn’t too bad though - this wiki page of the great JITWatch project has links for how to do this on Windows, Mac, or Linux. I followed the linked Linux steps without issue. Once you’ve built your library (on x86_64 this is hsdis-amd64.so), copy it to the /lib/<platform>/server directory of your Java install.
1. I’ve manually turned off UseCompressedOops here to simplify the output
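The exact command line is not shown in this text. As a rough sketch, the flags involved can be assembled like this. PrintAssembly is a diagnostic option, so on a regular (non-debug) JVM it has to be preceded by -XX:+UnlockDiagnosticVMOptions. The helper class PrintAssemblyLauncher is hypothetical, and it assumes the program under test is a class named Test on the current classpath.

```java
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class PrintAssemblyLauncher {
    // Builds a command line that runs mainClass with assembly printing enabled.
    public static List<String> buildCommand(String mainClass) {
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        // Must come before any diagnostic flag it is meant to unlock.
        cmd.add("-XX:+UnlockDiagnosticVMOptions");
        cmd.add("-XX:+PrintAssembly");
        // Matches the footnote above: compressed oops are switched off.
        cmd.add("-XX:-UseCompressedOops");
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));
        cmd.add(mainClass);
        return cmd;
    }

    public static void main(String[] args) {
        // Only prints the command; run it yourself and redirect the output to a file.
        System.out.println(String.join(" ", buildCommand("Test")));
    }
}
```

The output is large, so redirecting it to a file when you run the printed command makes it much easier to search.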
Our output file will actually contain 2 versions of getLen - the first generated by the c1 compiler, the second by c2. Let’s take a look at the second, more optimized, version:
(The code below the return is exception handling, and will be covered at a later date).
Even without much assembly experience, this isn’t too bad to understand - here’s what’s happening:
- Our function is passed its String in the %rsi register
- It finds the pointer to the string’s value character array, which it stores in %r10
- It then gets the length field from the array (all arrays in Java have a length field) and puts that in %eax to return to the caller
- The test instruction is doing what is known as a safepoint poll, and can be ignored for now.
Notice that we don’t see any actual calls to String.length - that’s because Hotspot has decided to inline that function for us.
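To make the inlining concrete, here is a small Java sketch of the before and after. SimpleString is a hypothetical, heavily simplified stand-in for a JDK 8 era java.lang.String (which keeps its characters in a char[] field named value); it is not the real JDK source. The "inlined" method is hand-written to mirror what the assembly does, not actual compiler output.

```java
public class InlineSketch {
    // Simplified stand-in for String: just a wrapper around a char array.
    public static final class SimpleString {
        final char[] value;

        public SimpleString(String s) {
            this.value = s.toCharArray();
        }

        // Mirrors the shape of a length accessor: read the array's length field.
        public int length() {
            return value.length;
        }
    }

    // What we wrote: a call to length().
    public static int getLen(SimpleString s) {
        return s.length();
    }

    // Roughly what runs after inlining: load the array, then load its length.
    public static int getLenInlined(SimpleString s) {
        return s.value.length;
    }

    public static void main(String[] args) {
        SimpleString s = new SimpleString("hello");
        System.out.println(getLen(s) == getLenInlined(s));
    }
}
```

The two loads in getLenInlined correspond to the two memory reads described in the list above: one to fetch the array pointer, one to fetch its length.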
-XX:CompileCommand
As you probably noticed in the last example, PrintAssembly prints output for every single method it compiles, including internal JVM functions. A lot of the time we only want to see a single method or class. That’s where CompileCommand comes in handy - you can tell it to print output only for certain classes/methods. The full documentation is available here, but for this example, we only care about the print command. To print our getLen function in class Test, we can pass the following: -XX:CompileCommand=print,Test.getLen
If our class Test was in package com.foo, we would pass print,com/foo/Test.getLen instead. You can also pass * as a wildcard for the method or the class, so print,Test.* would print all methods in class Test and print,*.getLen would print any getLen method no matter which class implemented it. Otherwise, this produces very similar output to PrintAssembly (and similarly requires the hsdis library).
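The pattern rules above can be captured in a few lines. This is only a sketch: CompileCommandArg and printFlag are hypothetical helper names, and the only behaviour taken from the post is the print command and the use of slashes in place of dots in the package name.

```java
public class CompileCommandArg {
    // Builds a -XX:CompileCommand=print,... flag from a Java-style class name
    // (dots allowed) and a method name. Either part may be "*".
    public static String printFlag(String className, String methodName) {
        String pattern = className.replace('.', '/') + "." + methodName;
        return "-XX:CompileCommand=print," + pattern;
    }

    public static void main(String[] args) {
        System.out.println(printFlag("Test", "getLen"));          // one method
        System.out.println(printFlag("com.foo.Test", "getLen"));  // class in a package
        System.out.println(printFlag("Test", "*"));               // every method in Test
        System.out.println(printFlag("*", "getLen"));             // getLen in any class
    }
}
```

When typing a wildcard pattern directly into a shell, quote the flag so the shell does not try to expand the *.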