[QCon 2019] Maximizing Performance with GraalVM

Thomas Wuerthinger 

For other QCon blog posts, see QCon live blog table of contents

Tradeoffs between which factors are optimized

  • Startup time
  • Peak throughput
  • Memory footprint
  • Maximum request latency
  • Packaging size (matters for mobile)
  • Can usually optimize a few (but not all) of these

GraalVM

  • Supports JVM languages, Ruby, Python, C, Rust, R, etc.
  • Can be embedded in Node.js and Oracle Database
  • Standalone binary
  • Community Edition and Enterprise Edition
  • Can run on OpenJDK using the Graal JIT compiler or with AOT (ahead-of-time) compilation

AOT

  • To use, create a new binary with pre-compiled code
  • Packages classes from the app, the libraries used, and part of the VM
  • Iterate on adding things until you know what you need, then create the native executable (see the build sketch after this list)
  • Uses an order of magnitude less memory than the JIT; saving memory helps when running on AWS Lambda
  • CPU usage is much lower up front, with just a small peak at startup
  • The JIT compiler has profiling feedback so it can do better in the long run; AOT has PGO (profile-guided optimizations) to compensate
  • Working on improvements: collecting profiles up front, a low-latency GC option, and a tracing agent to facilitate configuration
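
As a rough illustration (mine, not from the talk), here is a minimal sketch of that AOT workflow: an ordinary Java program compiled into a standalone native executable with GraalVM's native-image tool, with the tracing agent mentioned above used to record configuration first. The class name and paths are made up for the example; treat the exact commands as an approximation of the documented flow.

```java
// HelloServer.java – a hypothetical tiny app to compile ahead of time.
public class HelloServer {
    public static void main(String[] args) {
        // A real service would start an HTTP listener here;
        // a println keeps the sketch self-contained and runnable.
        System.out.println("ready to serve requests");
    }
}

// A typical build flow (shell commands shown as comments):
//   javac HelloServer.java
//   # Optional: run once on a normal JVM with the tracing agent to record
//   # reflection/resource usage into JSON config files.
//   java -agentlib:native-image-agent=config-output-dir=./native-config HelloServer
//   # Build the native executable (app classes + libraries + parts of the VM).
//   native-image HelloServer
//   ./helloserver   # starts in milliseconds, no JVM warm-up
```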

Performance

  • Startup time (from start until the first request can be served) is two orders of magnitude faster with AOT
  • Starting up in less than 50 milliseconds allows spinning up a new process per request
  • Hard to measure; you can be lucky/unlucky with the data you get
  • The JIT has an advantage for peak performance: it has profiling data and can make optimistic assumptions. If an assumption turns out to be wrong, it can de-optimize/bail out of the optimization (see the sketch after this list).
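
To make the "optimistic assumptions" point concrete, here is a small sketch of my own (not from the talk). While profiling has only ever seen one implementation of Shape at this call site, a JIT can speculate that the receiver is always a Square, inline area(), and skip the virtual dispatch; the first Circle invalidates that assumption and forces a deoptimization before recompiling without it. All class and method names are invented for illustration.

```java
// Sketch of a call site a JIT can optimize speculatively.
interface Shape {
    double area();
}

class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

public class Speculation {
    // While only Squares have been seen here, the compiled code can assume
    // the receiver type, inline area(), and avoid the virtual call.
    static double totalArea(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] squares = new Shape[100_000];
        for (int i = 0; i < squares.length; i++) squares[i] = new Square(2);
        System.out.println(totalArea(squares)); // monomorphic: fast, inlined path

        // The first Circle breaks the "always a Square" assumption;
        // the JIT de-optimizes and recompiles without the speculation.
        Shape[] mixed = { new Square(2), new Circle(1) };
        System.out.println(totalArea(mixed));
    }
}
```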

Benchmarks

  • Benchmarks are good; we should have more of them
  • Optimizing on too few benchmarks is like overfitting in machine learning
  • http://renaissance.dev/ – a benchmark suite that includes Scala and other less commonly tested workloads

Choosing

  • GraalVM JIT – use when you need peak throughput, good max latency, and no configuration
  • GraalVM AOT – use when you need fast startup time, a small memory footprint, and a small packaging size

He recommends reading "Top 10 Things To Do With GraalVM".

Q&A

  • Have you considered using Epsilon in the benchmarks? Not yet, but it makes sense since it doesn't do any GC.
  • Why not use the parallel GC? Not sure it would make a difference. Kirk noted it would avoid the allocation hit of G1.
  • Does AOT make sense for large heaps? At least they can make sure it isn't at a disadvantage.

My impressions

I had heard about Graal before and had forgotten a lot, so I re-learned much. I liked the list-of-steps slides and the diagram; I feel like it will be more memorable this time. I also liked the comparison at the end on the impact of the dimensions covered up front.

JVM Death Match – live blogging from QCon

JVM Death Match
Speakers:
Daniel Heidinga – IBM
Gil Tene – Azul
Thomas Wuerthinger – Oracle

See the list of all blog posts from the conference

This was a joint session of the NY Java Sig and the ACGNJ group. Fun fact – they have the URLs javasig.com and javasig.org respectively.

Graal Vision and Architecture – Thomas at Oracle

  • Java is still the primary language on the JVM, but there are lots of others.
  • The Graal compiler runs on top of the JVM and can run JVM languages.
  • Truffle framework – allows running Ruby, R, and JavaScript on the JVM
  • Sulong runs on top of Truffle and adds support for C/C++
  • Can mix and match languages (see the polyglot sketch after this list)
  • Vision: become more polyglot and more embeddable
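
As a rough sketch of what "mix and match" looks like in practice (my example, not from the talk), the GraalVM polyglot API lets Java code evaluate guest-language snippets such as JavaScript. The specific function and values here are invented for illustration, and it assumes a GraalVM runtime with the JavaScript language installed.

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;

public class PolyglotDemo {
    public static void main(String[] args) {
        // Create a polyglot context; on GraalVM it can host JavaScript,
        // Ruby, R, Python, etc., depending on which languages are installed.
        try (Context context = Context.create()) {
            // Evaluate a JavaScript function and call it from Java.
            Value doubler = context.eval("js", "(n) => n * 2");
            System.out.println(doubler.execute(21).asInt()); // prints 42
        }
    }
}
```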

Zing – Gil at Azul

  • Only company that builds nothing but JVMs
  • Zulu is OpenJDK. OpenJDK only produces source code, not binaries; Zulu is 100% open source. They differentiate for embedded platforms.
  • Zing is the differentiator, namely speed.
  • Gil went over the graph about optimization that we saw in his session earlier in the day
  • Falcon is the JIT compiler
  • There is logic to pre-tune so it reaches full speed sooner

Open J9 – Daniel at IBM

  • Number 1 cloud runtime
  • In the cloud, memory costs more than CPU. Three times smaller than OpenJDK in their benchmark.
  • They have stripped down the JDK so the image is smaller
  • Trace engine and dump engine – free diagnostic tools; it is important to be able to see what the JVM is doing
  • Work with hardware vendors
  • Plan to open source J9 before Java 9 launches

Selection of the Q&A

  • Why use their JVM? IBM said theirs is the #1 cloud JVM. Azul pointed to Zulu for OpenJDK and to Zing as the best-tuned option. Oracle said you can combine Java with other languages or compile to native code. Oracle also disputed the performance claim.
  • How important is polyglot? Azul said you have to be able to beat the existing runtime to be useful. IBM said they tried to create a universal bytecode and it didn't work. Oracle said it is performing well, and that there is interest because companies have a big investment in Java source code with business logic and want to use Node.js for small apps, so they can reuse it. Azul said it is hard because people have their current tools in place. I wanted to ask why this over microservices. Azul and IBM both brought up that they think that is the future. Oracle said microservices are painful compared to just calling the data structures; Graal allows calling Java data structures from other languages now. Azul teased him that it is not in production yet.
  • R is becoming more popular due to machine learning. What about speed? Oracle noted that R is very slow and interpreted, so Graal helps a lot.
  • What about calling C from Java? Oracle said Project Panama does that. A future version of Truffle will do it too.
  • Who is working on optimizing regular bytecode? The source code knows more than the bytecode does, such as generics. IBM looked at it, but it creates a new problem – exploding templates, which use more memory. Azul lamented that Java 5 didn't go that route.
  • javac converts lambdas to a virtual call. All three panelists immediately said the JVM can tune that.
  • Do IBM clients have production experience with J9? Yes; it has been a product for 20 years and is upgraded regularly.
  • How does Oracle manage different versions? You need to pick a version of the language, not mix and match. You can use interoperability, with each running in its own space.
  • How does IoT affect the memory footprint? IBM said Java might not be the right choice for a very memory-constrained environment; beyond that, a stripped-down JDK could be a good choice. Azul said Zulu Embedded goes into things like routers and printers; the current boundary is 11-20 MB of storage and mid-to-high tens of MB to run. He is happy the JVM can't run light bulbs, given the recent hack on light bulbs. Oracle is looking at what parts of the JRE are used and turning those parts into machine code. It does contain a GC, but not many other things, and has restrictions so you can't use things like generics/reflection.
  • Do any JVMs have a hard limit on memory used? Azul said yes, and again teased Oracle about their product not being in production. Azul also has an elastic garbage collector, so the kernel gets memory back as soon as GC happens. IBM has a soft MX so the JVM doesn't exceed the limit for the heap. Azul noted the problem is that JVMs keep dedicated padding because they might need it later; providing shared padding gives the same confidence – dynamically expanding and shrinking "insurance memory". IBM has detection for idle resources so other processes can use that memory as headroom.
  • Is Java the right language for things that appear and go away quickly, given the warm-up period – i.e., serverless? Azul said it should be and they are working on that problem now; even with front loading, a lot of CPU is used on startup. They are working on almost-instant startup, but that is in the future. IBM saves JIT state and profiled code to decrease startup time as well; you need to keep the JVM around for some length of time to minimize the effect of cold starts. Oracle said you can produce a quick start if you restrict the functionality used; moving around the program becomes less expensive compared to moving around data. Azul said they don't want to limit features. IBM said AOT is a great band-aid to solve the startup problem.
  • What happens when we reach the limit on the number of cores? Azul disagreed with the question and noted we've been hearing about the end of Moore's Law for ages; speed over time is still increasing. Oracle said it is never enough, so people will want more machines.