New Malloc Trace
The old malloc trace implementation depended on a glibc feature called malloc hooks. This feature was deprecated and is now removed from glibc versions >= 2.34. See this Red Hat blog for the reasoning behind the decision.
To keep providing a malloc trace, a new implementation was built. It uses a preloaded library which intercepts all malloc-related calls into the C library.
The new implementation has the following benefits:
- support for Alpine Linux and MacOSX in addition to glibc-based Linux systems
- the option to track the deallocation of memory, to get the 'live' memory
- the option to track only a fraction of the allocation calls to reduce overhead (both memory and CPU time)
The trace needs a library called `libmallochooks` to be preloaded at the start of the VM. This is usually done by the VM itself via the `-XX:+UseMallocHooks` flag. There are scenarios where this doesn't work, for example when the VM is not launched via `java` but by loading `libjvm.so` in a launcher. In this case the launcher has to be started with the following environment variable set:
- For Linux: `LD_PRELOAD=<path-to-vm>/lib/libmallochooks.so`
- For MacOSX: `DYLD_INSERT_LIBRARIES=<path-to-vm>/lib/libmallochooks.dylib`
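As a rough sketch of the launcher case, the snippet below picks the preload variable by operating system and prints the assignment to place in front of the launcher command. The function name `malloc_hooks_env`, the example path `/opt/vm` and the launcher name `./my-launcher` are hypothetical and only serve as illustration; the variable names and library file names are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: print the environment assignment needed to preload
# libmallochooks when the VM is started by a custom launcher.
#   $1: path to the VM installation (the <path-to-vm> placeholder)
#   $2: optional OS name; defaults to the output of `uname -s`
malloc_hooks_env() {
    vm_path=$1
    os=${2:-$(uname -s)}
    case "$os" in
        Linux)  echo "LD_PRELOAD=$vm_path/lib/libmallochooks.so" ;;
        Darwin) echo "DYLD_INSERT_LIBRARIES=$vm_path/lib/libmallochooks.dylib" ;;
        *)      echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Example use (assumed paths, not run here):
#   env "$(malloc_hooks_env /opt/vm)" ./my-launcher
```

Passing the assignment through `env` keeps the variable scoped to the launcher process instead of exporting it into the whole shell session.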
A VM started with `-XX:+UseMallocHooks` can then be instructed to start a malloc trace with `jcmd <pid> MallocTrace.enable`. This command supports several options:
- `-stack-depth`: The maximum stack depth to store when tracking an allocation.
- `-use-backtrace`: If this flag is supplied, the stack walking is done via the `backtrace` function of glibc, if available. On MacOSX and Alpine Linux the VM tries to load `libunwind` instead, for the same functionality. If neither can be found, the built-in fallback method to walk the stack is used. Using the fallback is usually faster, but might lead to stack traces with less accurate information.
- `-only-nth`: If given, not every allocation is tracked. For example, `-only-nth=3` leads to only every third allocation being tracked. This reduces both the memory footprint and the performance overhead. Note that the sampling is done somewhat randomly.
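To illustrate how the options combine on the command line, here is a minimal sketch that assembles (but does not run) an enable command. The helper name `malloc_trace_enable_cmd` and the pid `12345` are hypothetical. Only `-use-backtrace` and `-only-nth=<n>` are used, since those are the forms shown above; `-stack-depth` is left out because its value syntax is not spelled out here.

```shell
#!/bin/sh
# Hypothetical helper: build a MallocTrace.enable command line.
#   $1: pid of a VM that was started with -XX:+UseMallocHooks
#   $2: value for -only-nth (track only every n-th allocation)
malloc_trace_enable_cmd() {
    pid=$1
    nth=$2
    echo "jcmd $pid MallocTrace.enable -use-backtrace -only-nth=$nth"
}

# Print the command for pid 12345, sampling every third allocation.
malloc_trace_enable_cmd 12345 3
# prints: jcmd 12345 MallocTrace.enable -use-backtrace -only-nth=3
```

Printing the command first makes it easy to check the options before running it against a live VM.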