The presenter started with a disclaimer. It was not for legal reasons like Oracle's, but to emphasize that advice needs common sense and must be checked for applicability to your system.
Applications are slow because they are waiting for something, and you need to find out what that is. We generally start looking in our application, but we should start with a larger view: hardware/OS, JVM/memory, and actors/usage patterns.
Tips
- Make sure your test system is as close to production as possible. This includes the amount of data so you aren’t caching away your problem. In other words, you want to make sure you are solving the right problem.
- Keep going until the system meets its requirements or the user decides to stop paying for performance improvements, whichever comes first
- Remember to benchmark before and after each change so you know whether it worked and what the impact was
- Know how much CPU you can use. Even if you are only using forty percent, adding CPU does not help if you are only allocated forty percent
- Knowing which process is using the CPU helps. Is it your algorithms, the JVM doing garbage collection, or nothing, in which case the application is waiting on some other resource?
- After CPU, go on to other diagnostics such as memory, paging, etc
- You want the system to be calm when testing so you can use CPU as a benchmark
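To make the before/after benchmarking and the "your code, GC, or waiting" question concrete, here is a minimal sketch using the standard `java.lang.management` beans. The class name `WorkloadProbe`, the `Sample` holder, and the toy workload are my own illustrative names, not something from the talk. If wall time is much larger than CPU time plus GC time, the thread was probably waiting on something else.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class WorkloadProbe {
    /** One measured run. Hypothetical holder type, for illustration only. */
    static final class Sample {
        final long wallNanos; // elapsed wall-clock time
        final long cpuNanos;  // CPU time of the calling thread, or -1 if unavailable
        final long gcMillis;  // accumulated GC time across all collectors during the run

        Sample(long wallNanos, long cpuNanos, long gcMillis) {
            this.wallNanos = wallNanos;
            this.cpuNanos = cpuNanos;
            this.gcMillis = gcMillis;
        }
    }

    /** Sums the cumulative collection time reported by every collector. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Runs the workload once on the calling thread and records wall, CPU, and GC time. */
    static Sample measure(Runnable workload) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();

        long gcBefore = totalGcMillis();
        long cpuBefore = cpuSupported ? threads.getCurrentThreadCpuTime() : -1;
        long wallBefore = System.nanoTime();

        workload.run();

        long wall = System.nanoTime() - wallBefore;
        // getCurrentThreadCpuTime() returns -1 when measurement is disabled
        long cpu = cpuBefore >= 0 ? threads.getCurrentThreadCpuTime() - cpuBefore : -1;
        long gc = totalGcMillis() - gcBefore;
        return new Sample(wall, cpu, gc);
    }

    public static void main(String[] args) {
        // Toy workload: allocate and append so there is some CPU and GC activity.
        Sample s = measure(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 200_000; i++) {
                sb.append(i);
            }
        });
        System.out.printf("wall=%d ms, cpu=%d ms, gc=%d ms%n",
                s.wallNanos / 1_000_000,
                s.cpuNanos < 0 ? -1 : s.cpuNanos / 1_000_000,
                s.gcMillis);
    }
}
```

Run it before and after a change, on a quiet machine, and compare the three numbers rather than just the wall time. A single run like this is only a rough indicator; it does no warm-up or repetition.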
The Tooling
The presenter ran an example to walk through the tools
- Added the -XX:+PrintGCDetails JVM flag
- JMeter – shows the average time for points of interest in real time, both in numeric format and in a running graph. The run surfaced an out of memory error. The presenter noted that it is better to analyze garbage collection before increasing memory
- VisualVM – use the memory profiler to profile object allocations and GC. The instance count exposes leaking objects because they survive every attempt to collect them while more keep being created. The increasing generation count also shows the leak. Note that you can instrument every Xth allocation instead of each one to save space. You can see it happen in real time as you run. Then you can take a snapshot [the snapshot view is what I’m used to seeing for CodeRanch work]. The monitoring tab shows this in graphical format. The heap analysis shows all instances of objects of a certain type. Pick one at random and see the heap walk. Select the nearest GC root, which takes a minute or so. The GC root is what is anchoring the object to live memory.
- VisualVM – use the threads view to see if the thread pool can be increased. Are there enough threads servicing the number of users? One way to tune is to set the thread pool size larger and then scale back until it is good.
- GC.log – a number of graphs on garbage collection statistics – how long it takes, distribution, etc
- Censum – garbage collection log analysis tool that hasn’t been released yet. It shows stats and distributions on the generations
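To illustrate what "anchored by a GC root" means in the VisualVM notes above, here is a minimal sketch of the kind of leak the profiler would reveal. The class `LeakSketch` and its methods are hypothetical names of mine, not the presenter's example. Objects added to a static list stay reachable from a GC root, so they survive every collection while the count keeps growing.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

final class LeakSketch {
    // A static field is reachable from a GC root, so everything added here stays live.
    private static final List<byte[]> RETAINED = new ArrayList<>();

    /** Simulates a request that never releases its buffer (hypothetical example). */
    static WeakReference<byte[]> handleRequest() {
        byte[] buffer = new byte[1024];
        RETAINED.add(buffer);
        // The weak reference lets the caller check liveness without keeping the buffer alive itself.
        return new WeakReference<>(buffer);
    }

    static int retainedCount() {
        return RETAINED.size();
    }

    public static void main(String[] args) {
        WeakReference<byte[]> first = handleRequest();
        for (int i = 0; i < 9_999; i++) {
            handleRequest();
        }
        System.gc(); // only a hint, but strongly reachable objects are never collected
        System.out.println("retained buffers: " + retainedCount());
        System.out.println("first buffer still reachable: " + (first.get() != null));
    }
}
```

In a heap walk, each of these buffers would trace back to the static `RETAINED` list as its nearest GC root, which is the clue for where to fix the leak.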