Zero-allocation patterns in Java

However fast object creation is in Java, an excessive rate of object allocation can dramatically impact the overall performance of an application. Too many objects created in too little time will increase pressure on the garbage collector, resulting in more frequent stop-the-world pauses, which in turn translate into jitter and/or degraded response times for the end user.

Low-latency applications follow two broad strategies to work around this issue:

1- GC tuning: increasing the total heap available to the JVM will reduce the frequency of stop-the-world pauses (but not their duration). Allocating more threads to the parallel collectors and re-sizing the eden/survivor/tenured spaces may also help. However, these settings will become out of date sooner or later, when the volume or distribution of the data processed by the application changes, and will then need to be re-evaluated.

2- use non-allocating patterns: in short, strive to reduce the number of objects created, and with it the workload on the garbage collector. For example:

Profile, profile, profile
Identify allocation hotspots in the code using a profiler such as the Eclipse Memory Analyzer, YourKit or JProfiler. Fix the hotspot, then profile again, and repeat as long as necessary.

Use primitives instead of primitive wrappers and objects
Prefer int to Integer, double to Double, char to Character, etc. Local primitive variables live on the stack rather than the heap, and primitive fields are stored inline within their enclosing object, so neither creates extra work for the garbage collector.
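
A minimal sketch of the difference (the accumulation loop is purely illustrative):

    public class BoxedVsPrimitive {
        public static void main(String[] args) {
            // Boxed accumulator: each += unboxes, adds, then boxes the result,
            // allocating a new Long for all but the smallest cached values.
            Long boxedSum = 0L;
            for (int i = 0; i < 1_000_000; i++) {
                boxedSum += i;
            }
            // Primitive accumulator: lives on the stack and creates no garbage.
            long primitiveSum = 0L;
            for (int i = 0; i < 1_000_000; i++) {
                primitiveSum += i;
            }
            System.out.println(boxedSum + " " + primitiveSum);
        }
    }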

The same reasoning extends to data structures: a SparseArray (a map which uses primitives for its keys) will be more memory-efficient than a HashMap, which boxes both its keys and its values into objects.

Also worth mentioning is Trove, a library dedicated to primitives-only collections in Java.
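
For illustration, a primitives-only map with Trove might look like this (a sketch assuming Trove 3.x on the classpath; the symbol/position domain is made up):

    import gnu.trove.map.TIntLongMap;
    import gnu.trove.map.hash.TIntLongHashMap;

    public class Positions {
        public static void main(String[] args) {
            // int keys and long values are stored as raw primitives:
            // no Integer or Long wrapper is ever allocated.
            TIntLongMap positionBySymbolId = new TIntLongHashMap();
            positionBySymbolId.put(42, 1_000_000L);
            System.out.println(positionBySymbolId.get(42));
        }
    }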

Reconsider your logging strategy
Logging is a source of allocation. Try to reduce the logging level and the amount of information being logged. If you must log, then consider your logger implementation carefully and pick the lowest-allocation one.
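
Guarding log statements also helps. A sketch with SLF4J (an assumed choice of facade; the same idea applies to any logging API):

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class QuoteHandler {
        private static final Logger LOG = LoggerFactory.getLogger(QuoteHandler.class);

        void onQuote(long price) {
            // The guard skips both the message formatting and the
            // autoboxing of 'price' when debug logging is disabled.
            if (LOG.isDebugEnabled()) {
                LOG.debug("received quote, price={}", price);
            }
        }
    }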

Go off-heap
Direct buffers are allocated outside of the heap and hence are not subject to the vagaries of the garbage collector. They are better suited to long-lived objects such as the app's static data.
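
A minimal sketch using java.nio direct buffers (the fixed-slot price store is an illustrative assumption):

    import java.nio.ByteBuffer;

    public class OffHeapPrices {
        // 1 MB of native memory: the GC neither scans nor moves its contents.
        private final ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024);

        void storePrice(int slot, long price) {
            buffer.putLong(slot * Long.BYTES, price);
        }

        long readPrice(int slot) {
            return buffer.getLong(slot * Long.BYTES);
        }
    }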

Use object pools
It is well established that the concept of immutability leads to better-quality code. This is a fundamental tenet of the functional programming paradigm, and most functional languages enforce immutability, at least by default.

In certain circumstances, though, this can lead to the creation of an excessive number of objects: e.g. listening to market data updates from multiple external feeds, where each feed publishes thousands (if not millions) of messages per second. If each incoming message creates an object in memory, then that's a lot of objects for the JVM to keep up with. An alternative is to create a pool of objects which are kept in memory and reused for each incoming feed update.

Now, pooling has a bad rap in Java-land, and this is deserved to some extent. The technique does lead to more complicated code, is more error-prone, and can actually hurt performance in multi-threaded environments, where each resource managed by the pool has to be thread-safe. However, there are ways to achieve pooling without elaborate locking schemes, and the end result is well worth the additional development time; a minimal sketch follows.
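
By way of illustration, here is a deliberately simple single-threaded pool (the names are hypothetical; a production pool would add bounds and, where required, thread-safety):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.function.Supplier;

    public class ObjectPool<T> {
        private final Deque<T> free = new ArrayDeque<>();
        private final Supplier<T> factory;

        public ObjectPool(Supplier<T> factory, int initialSize) {
            this.factory = factory;
            // Pre-allocate once, up front, instead of once per message.
            for (int i = 0; i < initialSize; i++) {
                free.push(factory.get());
            }
        }

        public T acquire() {
            // Reuse a free instance if available, otherwise grow the pool.
            return free.isEmpty() ? factory.get() : free.pop();
        }

        public void release(T instance) {
            // The caller is responsible for resetting the instance's state.
            free.push(instance);
        }
    }

Each incoming feed update would then acquire() a mutable message object, populate it, process it, and release() it, rather than allocating a fresh object per message.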

Benchmarking String.intern() with JMH

String interning overview

From the Oracle Javadocs:

String.intern() returns a canonical representation for the string object.

In other words, interned strings are pooled so that there is one instance of every distinct string value (the canonical representation) in memory. This also means interned strings can be compared using the ‘==’ operator rather than equals(), since there is no possibility of having two identical strings at two different memory addresses (provided all strings are interned).
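
A small demonstration of that property (the string value is arbitrary):

    public class InternDemo {
        public static void main(String[] args) {
            String a = new String("EUR/USD").intern();
            String b = new String("EUR/USD").intern();
            // Both variables point at the same canonical instance.
            System.out.println(a == b); // true
            // Without interning, identical values are distinct objects.
            System.out.println(new String("EUR/USD") == new String("EUR/USD")); // false
        }
    }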

The downside is that invoking intern() is more taxing in CPU time than a mere string allocation.

The upside is that interning optimises memory consumption, as a function of how many dynamically built strings the application generates, and how many of those strings are unique.

Microbenchmarking

The cost/benefit of string interning needs to be assessed on a case-by-case basis, by taking appropriate time measurements.

The classic way to do so is to rely on a stopwatch to calculate the elapsed time before and after the operation being measured. This technique works relatively well for large macro-benchmarks, where the operation being measured takes more than a few seconds, e.g. a database lookup.

Stopwatches, however, fail to take into account the many tricks used by the JVM to optimise code at runtime: warmup, inlining, dead-code elimination, loop unrolling, etc., which can lead to biased results when dealing with millisecond/microsecond measurements. A preferred option in that case is to use a microbenchmarking framework for Java, such as Caliper or JMH, which will generate benchmark code that takes the above pitfalls into account.

Benchmark example with JMH

[gist https://gist.github.com/eleco/d4096caa751eda96bf8f /]
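
For readers who cannot load the gist: in outline, the benchmark pits a plain string concatenation against the same concatenation followed by intern(). A minimal sketch of that shape (illustrative names, not the gist’s exact code; it assumes the jmh-core and jmh-generator-annprocess dependencies):

    import java.util.concurrent.TimeUnit;
    import org.openjdk.jmh.annotations.*;

    @State(Scope.Thread)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public class InternBenchmark {
        int next;

        @Benchmark
        public String allocate() {
            // A fresh String instance on every call.
            return "symbol-" + (next++ % 1_000);
        }

        @Benchmark
        public String intern() {
            // Same values, collapsed to their canonical instances.
            return ("symbol-" + (next++ % 1_000)).intern();
        }
    }

Returning the string from each method lets JMH consume the value, which stops the JIT from eliminating the benchmarked code as dead.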

Results

In the scenario above, interning improves performance significantly. The CPU cost of the intern() call is moot when it substantially reduces overall GC pressure.

[Chart: JMH benchmark results for string interning]

Testing shell scripts

This story highlights the potentially dire consequences of undefined variables in shell scripts.

In summary: entire directories were lost because of a buggy shell script running this command:

rm -rf "$VAR/" *

… which is intended to delete all files under the $VAR directory. However, if $VAR is left undefined, then what is executed instead is:

rm -rf /*

… which will attempt to delete every file on the system, starting from the root directory. Oops.

It’s somewhat paradoxical that shell scripts are frequently used to drive mission-critical activities such as starting/stopping processes and copying, moving and deleting files, and yet these scripts are error-prone, often hard to read and always hard to test.

Mitigations

Being at the mercy of a buggy shell script is not fun. Thankfully there are ways to prevent disaster from happening.

  • Add “set -eu” at the top of the script.
    “-e” causes the script to terminate if any command fails; “-u” causes it to terminate when it encounters an unbound variable. One additional line of code that will save a lot of trouble, and it should be mandatory at the top of every script.

  • Check that variables are set before use
    [[ "$VAR" ]] && rm -rf "$VAR/"*
    It’s not foolproof though, as it won’t prevent failures due to typos in the variable name.

  • Use a unit test framework
    Yes, shell scripts have their own unit-test frameworks; see Roundup or ShUnit.

  • Use a (real) programming language
    bash/zsh were never meant to be used for complex programming tasks. Replace them with Python, or Perl, or Groovy, or Java. Programming languages have much better support for functions, variable scoping, conditionals, string handling, etc. than shell scripts do. The idea is to avoid shell scripts save for the simplest tasks, and to use a programming language to handle any kind of elaborate control logic (anything more complex than a pair of if/else statements).