More JVM Signal tricks - Thread control via mprotect

In my last post, I mentioned that the JVM intentionally uses SIGSEGVs in other interesting ways. In this post I’ll give an overview of one of those other ways: synchronization for GC pauses.

If you ever look at generated assembly from Hotspot, you will find methods ending with very odd-looking tests like this:

test   %eax,0x16e71fa4(%rip)        # 0x00007fce84071000

At first glance, this doesn’t seem to serve any real purpose right before the return. Rather than doing any useful work, it’s just a way of trying to read from 0x7fce84071000, the designated polling page. Changing the access permissions on this page lets the JVM stop threads cleanly - in places where the state of the world is well-known.

We can make the world stop

To stop the world, the JVM calls SafepointSynchronize::begin in safepoint.cpp, which calls os::make_polling_page_unreadable. The Linux implementation of that calls mprotect on the page with PROT_NONE (no access allowed):

// Mark the polling page as unreadable
void os::make_polling_page_unreadable(void) {
  if( !guard_memory((char*)_polling_page, Linux::page_size()) )
    fatal("Could not disable polling page");
};
...
bool os::guard_memory(char* addr, size_t size) {
  return linux_mprotect(addr, size, PROT_NONE);
}
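
To get a feel for what that mprotect call does without crashing anything, here is a minimal C++ sketch (Linux only). The helper names page_size, alloc_polling_page, guard_page, unguard_page and is_readable are made up for illustration and are not Hotspot's; only the system calls are real. It probes the page by asking the kernel to copy one byte out of it with write(), which fails with EFAULT rather than raising a SIGSEGV when the page is unreadable.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Helper names below are made up for this sketch; only the system calls
// (sysconf, mmap, mprotect, pipe, write, close) are real.

std::size_t page_size() {
  long n = sysconf(_SC_PAGESIZE);
  assert(n > 0);
  return static_cast<std::size_t>(n);
}

// Map one private, anonymous, read-only page to stand in for a polling page.
void* alloc_polling_page() {
  void* p = mmap(nullptr, page_size(), PROT_READ,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Same idea as the guard_memory excerpt above: revoke all access.
bool guard_page(void* addr, std::size_t size) {
  return mprotect(addr, size, PROT_NONE) == 0;
}

// Inverse operation: make the page readable again.
bool unguard_page(void* addr, std::size_t size) {
  return mprotect(addr, size, PROT_READ) == 0;
}

// Check readability without faulting: write() has the kernel read one byte
// from addr, and it returns -1 (errno EFAULT) if the page cannot be read.
bool is_readable(const void* addr) {
  int fds[2];
  if (pipe(fds) != 0) return false;
  ssize_t n = write(fds[1], addr, 1);
  close(fds[0]);
  close(fds[1]);
  return n == 1;
}
```

Calling is_readable on the page before guard_page, after guard_page, and again after unguard_page should report readable, unreadable, readable, in that order.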

When our compiled code hits those tests, it will now segfault. Looking again at Hotspot’s Linux x86 signal handler, in os_linux_x86.cpp:

extern "C" JNIEXPORT int
JVM_handle_linux_signal(int sig,
                        siginfo_t* info,
                        void* ucVoid,
                        int abort_if_unrecognized) {
  ucontext_t* uc = (ucontext_t*) ucVoid;
  ...
  pc = (address) os::Linux::ucontext_get_pc(uc);
  
  ...
  
  if (thread->thread_state() == _thread_in_Java) {
    // Java thread running in Java code => find exception handler if any
    // a fault inside compiled code, the interpreter, or a stub

    if (sig == SIGSEGV && os::is_poll_address((address)info->si_addr)) {
      stub = SharedRuntime::get_poll_stub(pc);
    }
  ...

This case is handled via SharedRuntime::get_poll_stub, which returns the address of the handler, a wrapper around SafepointSynchronize::handle_polling_page_exception.
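
The dispatch above can be imitated in a few lines of standalone C++ (Linux only). This is a sketch under simplifying assumptions, not what Hotspot does: all names here (install_poll_handler, arm_poll, safepoint_poll, polls_caught) are hypothetical, and instead of redirecting the thread to a stub that blocks until the safepoint is over, the handler just counts the fault and makes the page readable again so the faulting load can be retried. It also calls mprotect from inside the signal handler, which works on Linux in practice but is not on the POSIX list of async-signal-safe functions.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// All names in this sketch are hypothetical; only the system calls are real.
static char* g_poll_page = nullptr;
static std::size_t g_page_size = 0;
static volatile sig_atomic_t g_polls_caught = 0;

// Rough analogue of os::is_poll_address: is the fault inside our page?
static bool is_poll_address(const void* addr) {
  auto a = reinterpret_cast<std::uintptr_t>(addr);
  auto base = reinterpret_cast<std::uintptr_t>(g_poll_page);
  return a >= base && a < base + g_page_size;
}

static void on_segv(int sig, siginfo_t* info, void*) {
  if (sig == SIGSEGV && is_poll_address(info->si_addr)) {
    g_polls_caught = g_polls_caught + 1;
    // Simplification: re-open the page so the faulting load restarts and
    // succeeds. The JVM instead parks the thread in a stub.
    mprotect(g_poll_page, g_page_size, PROT_READ);
    return;
  }
  // Any other fault is a real bug: restore the default action so the
  // re-executed instruction kills the process as usual.
  signal(SIGSEGV, SIG_DFL);
}

// Map the polling page (initially readable) and install the handler.
bool install_poll_handler() {
  g_page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  void* p = mmap(nullptr, g_page_size, PROT_READ,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return false;
  g_poll_page = static_cast<char*>(p);

  struct sigaction sa{};
  sa.sa_sigaction = on_segv;
  sa.sa_flags = SA_SIGINFO;  // needed to receive si_addr
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// "Stop the world": make the next poll fault.
bool arm_poll() {
  assert(g_poll_page != nullptr);
  return mprotect(g_poll_page, g_page_size, PROT_NONE) == 0;
}

// Stand-in for the `test` instruction: a load whose value is thrown away.
void safepoint_poll() {
  volatile char* p = g_poll_page;
  char ignored = *p;
  (void)ignored;
}

int polls_caught() { return g_polls_caught; }
```

While the page is readable, safepoint_poll is just a cheap load. After arm_poll, the next safepoint_poll traps into on_segv exactly once, and later polls are cheap again because the handler re-opened the page.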

In Action

To see this in action, here’s a quick demo. We start 3 threads doing a bunch of (non-object-allocating) busy work, and then manually issue a GC from the main thread after waiting long enough to be fairly confident they have been compiled.

import java.util.Random;

class GC {
  private static Random r = new Random();

  private static class Fib extends Thread {
    public int fib(int n) {
      int s1 = 1;
      int s2 = 1;
      for(int i = 0; i < n; i++) {
        int temp = s1 + s2;
        s1 = s2;
        s2 = temp;
      }
      return n;
    }

    @Override
    public void run() {
      long res = 0;
      for(int i = 0; i < 100000000; i++) {
        res += fib(r.nextInt(200000000));
      }
      System.out.println(res);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    new Fib().start();
    new Fib().start();
    new Fib().start();
    Thread.sleep(2000);
    System.gc();
  }
}

Again, we can run this via strace:

strace -f -o outfile java GC

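and clearly see this in action: an mprotect with PROT_NONE (preventing both reads and writes) is issued on the page starting at 0x7fce84071000, and all 3 threads quickly segfault trying to read from it. (With -f, strace prefixes each line with the ID of the thread that produced it.)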

23172 mprotect(0x7fce84071000, 4096, PROT_NONE) = 0
23182 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fce84071000} ---
23180 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fce84071000} ---
23181 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fce84071000} ---
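
If the outfile is long, a small filter makes the pattern easier to spot. The following C++ sketch counts SIGSEGV lines per faulting address; count_segv_by_addr is a hypothetical helper, and it assumes the line layout shown above (a "--- SIGSEGV" marker and an "si_addr=" field), which may differ between strace versions.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: tally SIGSEGV lines in strace output by si_addr.
std::map<std::string, int> count_segv_by_addr(const std::string& log) {
  std::map<std::string, int> counts;
  std::istringstream in(log);
  std::string line;
  const std::string key = "si_addr=";
  while (std::getline(in, line)) {
    // Only signal-delivery lines are of interest.
    if (line.find("--- SIGSEGV") == std::string::npos) continue;
    std::string::size_type start = line.find(key);
    if (start == std::string::npos) continue;
    start += key.size();
    assert(start <= line.size());
    // The address runs up to the next ',' or the closing '}'.
    std::string::size_type end = line.find_first_of(",}", start);
    counts[line.substr(start, end - start)]++;
  }
  return counts;
}
```

Fed the four lines above, it should report a single address, 0x7fce84071000, with a count of 3.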

Nifty, eh?

(Shameless plug: If you are, like me, crazy enough to find this sort of stuff interesting, follow me on Twitter.)