
Why Java is Memory Hungry

Weixing Sun edited this page Mar 1, 2019 · 8 revisions

Analyzing the SPECjbb2015 model at the OS level

  1. Introduce Memory Bandwidth

  2. Introduce Memory Latency

  3. Stream / MLC

  4. BCC Tools

  5. Measurements

  6. Conclusion

  7. Future work

Details:

  1. Introduce Memory Bandwidth

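Memory bandwidth is the maximum rate at which a physical server can move data to and from main memory. It sets the upper bound on how fast newly allocated memory can actually be filled and read.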
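As a rough illustration, the theoretical peak bandwidth of the test machine listed under Measurements can be estimated from its DIMM speed. This is a minimal sketch, not a measurement: the function name is hypothetical, and it assumes a 64-bit channel and one DIMM per channel (dual-channel), which the page does not state.

```python
def peak_bandwidth_gb_s(transfers_per_sec, bus_width_bits=64, channels=2):
    """Theoretical peak = transfers/s * bytes per transfer * channels.

    bus_width_bits=64 and channels=2 are assumptions about the test
    machine, not facts taken from the measurements.
    """
    bytes_per_transfer = bus_width_bits // 8
    return transfers_per_sec * bytes_per_transfer * channels / 1e9


# DDR4 2666: about 2666 million transfers per second per channel.
ddr4_2666 = peak_bandwidth_gb_s(2666e6)
print(f"Theoretical peak: {ddr4_2666:.3f} GB/s")
```

Sustained bandwidth reported by a tool such as STREAM will be lower than this ceiling.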

  2. Introduce Memory Latency

Memory latency, as used here, is the time the system takes to allocate a memory region to a process.

Servers with a NUMA architecture are sensitive to this measurement, because remote memory allocation latency is 2~3x that of local allocation.

  3. Stream / MLC

Stream (the STREAM benchmark) is a tool that measures the maximum sustainable memory throughput of a server.

MLC (Intel Memory Latency Checker) is a tool that measures local and remote memory latency.
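To make the throughput figure concrete, here is a minimal sketch of a STREAM-style triad kernel and the byte accounting behind a reported bandwidth. It is an illustration only: the real benchmark is written in C/Fortran, the helper names are hypothetical, and a pure-Python loop is bound by the interpreter, so the number it prints is not a memory bandwidth measurement.

```python
import time


def triad(a, b, c, scalar):
    """STREAM-style triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bytes(n, word_size=8):
    """Bytes counted per triad pass: two reads and one write per element."""
    return 3 * word_size * n


n = 100_000
a, b, c = [0.0] * n, [1.0] * n, [2.0] * n

start = time.perf_counter()
triad(a, b, c, 3.0)
elapsed = time.perf_counter() - start

# Bandwidth = bytes moved / elapsed time (interpreter-bound here).
print(f"{triad_bytes(n) / elapsed / 1e6:.1f} MB/s")
```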

  4. BCC Tools

BCC is a very neat set of open-source profiling and tracing tools, available at: https://github.com/iovisor/bcc

  5. Measurements

The measurement has two steps: run the BCC tools against Stream, then against SPECjbb2015 (jbb2015).

Hardware:

  CPU: Intel Xeon E-2186G CPU @ 3.80GHz
  Memory: 2x 32GB DDR4 2666MHz

Software:

  OS: SUSE SLE-15
  Tools: BCC, Stream

bcc on stream:

st250:/home/sun/jbb/stream/archive # cat func_arg_kmalloc.log

[18:25:31]

 size                : count     distribution

     0 -> 1          : 0        |                                        |
     2 -> 3          : 0        |                                        |
     4 -> 7          : 36       |****************************************|
     8 -> 15         : 15       |****************                        |
    16 -> 31         : 12       |*************                           |

st250:/home/sun/jbb/stream/archive # cat funccount_kmalloc.log

Tracing 1 functions for "__kmalloc"... Hit Ctrl-C to end.

FUNC COUNT

__kmalloc 728

Detaching...

st250:/home/sun/jbb/stream/archive # cat funccount_kmalloc_slab.log

Tracing 1 functions for "kmalloc_slab"... Hit Ctrl-C to end.

FUNC COUNT

kmalloc_slab 919

Detaching...

st250:/home/sun/jbb/stream/archive # cat funclatency_kmalloc.log

Tracing 1 functions for "__kmalloc"... Hit Ctrl-C to end.

 nsecs               : count     distribution
     0 -> 1          : 0        |                                        |
     2 -> 3          : 0        |                                        |
     4 -> 7          : 0        |                                        |
     8 -> 15         : 0        |                                        |
    16 -> 31         : 0        |                                        |
    32 -> 63         : 0        |                                        |
    64 -> 127        : 0        |                                        |
   128 -> 255        : 0        |                                        |
   256 -> 511        : 68       |************                            |
   512 -> 1023       : 203      |**************************************  |
  1024 -> 2047       : 211      |****************************************|
  2048 -> 4095       : 58       |**********                              |
  4096 -> 8191       : 19       |***                                     |
  8192 -> 16383      : 10       |*                                       |
 16384 -> 32767      : 6        |*                                       |

Detaching...

st250:/home/sun/jbb/stream # cat funclatency_kmalloc_slab.log

Tracing 1 functions for "kmalloc_slab"... Hit Ctrl-C to end.

 nsecs               : count     distribution
     0 -> 1          : 0        |                                        |
     2 -> 3          : 0        |                                        |
     4 -> 7          : 0        |                                        |
     8 -> 15         : 0        |                                        |
    16 -> 31         : 0        |                                        |
    32 -> 63         : 0        |                                        |
    64 -> 127        : 0        |                                        |
   128 -> 255        : 220      |****************************************|
   256 -> 511        : 215      |*************************************** |
   512 -> 1023       : 144      |**************************              |
  1024 -> 2047       : 70       |************                            |
  2048 -> 4095       : 29       |*****                                   |
  4096 -> 8191       : 16       |**                                      |
  8192 -> 16383      : 5        |                                        |

Detaching...

bcc on jbb2015:

st250:/home/sun/jbb/18-11-12_184210 # cat func_arg_kmalloc.log

 size                : count     distribution

     0 -> 1          : 0        |                                        |
     2 -> 3          : 0        |                                        |
     4 -> 7          : 36       |****************************************|
     8 -> 15         : 15       |****************                        |
    16 -> 31         : 12       |*************                           |

st250:/home/sun/jbb/18-11-12_184210 # cat funccount_kmalloc.log

Tracing 1 functions for "__kmalloc"... Hit Ctrl-C to end.

FUNC COUNT

__kmalloc 796

Detaching...

st250:/home/sun/jbb/18-11-12_184210 # cat funccount_kmalloc_slab.log

Tracing 1 functions for "kmalloc_slab"... Hit Ctrl-C to end.

FUNC COUNT

kmalloc_slab 56951

Detaching...

st250:/home/sun/jbb/18-11-12_184210 # cat funclatency_kmalloc.log

Tracing 1 functions for "__kmalloc"... Hit Ctrl-C to end.

 nsecs               : count     distribution
     0 -> 1          : 0        |                                        |
     2 -> 3          : 0        |                                        |
     4 -> 7          : 0        |                                        |
     8 -> 15         : 0        |                                        |
    16 -> 31         : 0        |                                        |
    32 -> 63         : 0        |                                        |
    64 -> 127        : 0        |                                        |
   128 -> 255        : 0        |                                        |
   256 -> 511        : 0        |                                        |
   512 -> 1023       : 170      |**********************                  |
  1024 -> 2047       : 304      |****************************************|
  2048 -> 4095       : 42       |*****                                   |
  4096 -> 8191       : 32       |****                                    |
  8192 -> 16383      : 4        |                                        |

Detaching...

st250:/home/sun/jbb/18-11-12_184210 # cat funclatency_kmalloc_slab.log

Tracing 1 functions for "kmalloc_slab"... Hit Ctrl-C to end.

 nsecs               : count     distribution
     0 -> 1          : 0        |                                        |
     2 -> 3          : 0        |                                        |
     4 -> 7          : 0        |                                        |
     8 -> 15         : 0        |                                        |
    16 -> 31         : 0        |                                        |
    32 -> 63         : 0        |                                        |
    64 -> 127        : 0        |                                        |
   128 -> 255        : 49       |                                        |
   256 -> 511        : 989      |*                                       |
   512 -> 1023       : 38808    |****************************************|
  1024 -> 2047       : 19864    |********************                    |
  2048 -> 4095       : 1987     |**                                      |
  4096 -> 8191       : 280      |                                        |
  8192 -> 16383      : 25       |                                        |
 16384 -> 32767      : 9        |                                        |

Detaching...
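The histograms above are easier to compare once reduced to a few numbers. The following is a minimal sketch of a parser for this log2 histogram format; the function names are hypothetical, the sample rows are the non-zero kmalloc_slab buckets copied from the logs above, and the mean is only an approximation based on bucket midpoints.

```python
import re

# One histogram row, e.g. "   512 -> 1023       : 38808    |****|"
ROW = re.compile(r"^\s*(\d+)\s*->\s*(\d+)\s*:\s*(\d+)\s*\|")


def parse_histogram(text):
    """Return (low, high, count) tuples for every histogram row in text."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append(tuple(int(g) for g in m.groups()))
    return rows


def summarize(rows):
    """Total count, busiest bucket, and a midpoint-based approximate mean."""
    if not rows:
        raise ValueError("no histogram rows found")
    total = sum(count for _, _, count in rows)
    low, high, _ = max(rows, key=lambda r: r[2])
    approx_mean = sum((lo + hi) / 2 * cnt for lo, hi, cnt in rows) / total
    return {"total": total, "mode": (low, high), "approx_mean": approx_mean}


STREAM_SLAB = """
   128 -> 255        : 220      |****************************************|
   256 -> 511        : 215      |*************************************** |
   512 -> 1023       : 144      |**************************              |
  1024 -> 2047       : 70       |************                            |
  2048 -> 4095       : 29       |*****                                   |
  4096 -> 8191       : 16       |**                                      |
  8192 -> 16383      : 5        |                                        |
"""

JBB_SLAB = """
   128 -> 255        : 49       |                                        |
   256 -> 511        : 989      |*                                       |
   512 -> 1023       : 38808    |****************************************|
  1024 -> 2047       : 19864    |********************                    |
  2048 -> 4095       : 1987     |**                                      |
  4096 -> 8191       : 280      |                                        |
  8192 -> 16383      : 25       |                                        |
 16384 -> 32767      : 9        |                                        |
"""

for name, text in (("stream", STREAM_SLAB), ("jbb2015", JBB_SLAB)):
    s = summarize(parse_histogram(text))
    print(f"{name}: calls={s['total']} busiest bucket={s['mode']} ns")
```

The totals come from the funclatency runs, so they need not match the separate funccount runs exactly.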


  6. Conclusion

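The __kmalloc call counts (728 under Stream, 796 under SPECjbb2015) and latency distributions (both peak at 1024-2047 ns) are fairly close in the two tests, and kmalloc_slab is called far more often under SPECjbb2015 (56951 vs. 919). In other words, SPECjbb2015 loads the memory subsystem about as heavily as a benchmark built to stress it, which means the JVM is very memory hungry.

If your customer needs better JVM performance, ask for more memory throughput or reduce the memory allocation rate.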

In fact, JVM bytecode relies heavily on memory, because almost every instruction is stack-based. This is also what lets the JVM run programs across CPU architectures: the bytecode does not depend on any particular register set.
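A toy stack machine shows why stack-based code touches memory so often. This is a minimal sketch, not the JVM: it borrows the opcode names iload, imul, iadd and istore, the interpreter and its counter are hypothetical, and it models only the bytecode level, not what a JIT compiler emits.

```python
def run(program, local_vars):
    """Execute a tiny stack-based program and count operand-stack accesses."""
    stack = []
    stack_ops = 0  # every push or pop counts as one access
    for op, *args in program:
        if op == "iload":        # push a local variable
            stack.append(local_vars[args[0]])
            stack_ops += 1
        elif op in ("iadd", "imul"):  # pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "iadd" else left * right)
            stack_ops += 3
        elif op == "istore":     # pop the result into a local variable
            local_vars[args[0]] = stack.pop()
            stack_ops += 1
        else:
            raise ValueError(f"unknown opcode: {op}")
    return local_vars, stack_ops


# d = a + b * c, with a, b, c in local slots 0..2 and d in slot 3
program = [
    ("iload", 0),
    ("iload", 1),
    ("iload", 2),
    ("imul",),
    ("iadd",),
    ("istore", 3),
]

result, ops = run(program, [2, 3, 4, 0])
print(f"d = {result[3]}, operand-stack accesses = {ops}")
```

Six instructions produce ten operand-stack accesses in this model, none of which name a register.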

  7. Future work

We know that memory allocation differs between the heap and the stack. The next step is to figure out how to separate heap and stack allocation addresses in the output of the BCC tools.