UseNUMA membind Issue in openJDK

Swati Sharma swatibits14 at gmail.com
Thu Apr 26 12:20:21 UTC 2018


Hi Everyone,

I work at AMD, and this is my first patch as a new member of the OpenJDK community.

I have found an issue while running the SPECjbb2015 composite workload with the flag -XX:+UseNUMA. The JVM does not allocate memory according to the explicit node binding done with "numactl --membind".

E.g. if the process is bound to a single memory node, the JVM still divides the whole heap based on the total number of NUMA nodes available on the system. This creates more logical groups (lgrps) than required, and all of them except the bound one go unused.

The following examples illustrate this. (Note: GC logs were collected with -Xlog:gc*=debug:file=gc.log:time,uptimemillis)

  1. Allocating a heap of 22GB while bound to a single node divides it into 8 lgrps (the actual number of nodes is 8):

     $numactl --cpunodebind=0 --membind=0 java -Xmx24g -Xms24g -Xmn22g -XX:+UseNUMA

     eden space 22511616K(22GB), 12% used
       lgrp 0 space 2813952K, 100% used
       lgrp 1 space 2813952K, 0% used
       lgrp 2 space 2813952K, 0% used
       lgrp 3 space 2813952K, 0% used
       lgrp 4 space 2813952K, 0% used
       lgrp 5 space 2813952K, 0% used
       lgrp 6 space 2813952K, 0% used
       lgrp 7 space 2813952K, 0% used

Observation : Instead of disabling UseNUMA for a single-node binding, the JVM divides the memory into 8 lgrps but always allocates on the bound node, hence eden space usage never goes above about 12%.

  2. Another case, binding to nodes 0 and 7, likewise results in dividing the heap into 8 lgrps:

     $numactl --cpunodebind=0,7 --membind=0,7 java -Xms50g -Xmx50g -Xmn45g -XX:+UseNUMA

     eden space 46718976K, 6% used
       lgrp 0 space 5838848K, 14% used
       lgrp 1 space 5838848K, 0% used
       lgrp 2 space 5838848K, 0% used
       lgrp 3 space 5838848K, 0% used
       lgrp 4 space 5838848K, 0% used
       lgrp 5 space 5838848K, 0% used
       lgrp 6 space 5838848K, 0% used
       lgrp 7 space 5847040K, 35% used

Observation : Similar to the first case, allocation happens only on nodes 0 and 7; the rest of the lgrps never get used.

After applying the patch, the JVM divides the given heap size among the bound memory nodes only.

  1. Binding to a single node disables UseNUMA:

     eden space 46718976K(45GB), 99% used

Observation : UseNUMA gets disabled, hence no lgrps are created and the whole heap is allocated on the bound node.

  2. Binding to nodes 0 and 7:

     $ numactl --cpunodebind=0,7 --membind=0,7 java -Xms50g -Xmx50g -Xmn45g -XX:+UseNUMA

     eden space 46718976K(45GB), 99% used
       lgrp 0 space 23359488K(23.5GB), 100% used
       lgrp 7 space 23359488K(23.5GB), 99% used

Observation : Only two lgrps get created, and the heap is divided equally between the two bound nodes.

If there is no binding, then JVM will divide the whole heap based on the number of NUMA nodes available on the system.

The following patch fixes the issue (also attached). Please review and let me know your comments.

Regression testing using jtreg (make -J=1 run-test-tier1 run-test-tier2) didn't show any new failures.

===============================PATCH========================================
diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp
--- a/src/hotspot/os/linux/os_linux.cpp
+++ b/src/hotspot/os/linux/os_linux.cpp
@@ -2832,8 +2832,10 @@
   // Map all node ids in which it is possible to allocate memory. Also nodes are
   // not always consecutively available, i.e. available from 0 to the highest
   // node number.
 ...
                                                libnuma_dlsym(handle, "numa_bitmask_isbitset")));
     set_numa_distance(CAST_TO_FN_PTR(numa_distance_func_t,
                                      libnuma_dlsym(handle, "numa_distance")));
+    set_numa_set_membind(CAST_TO_FN_PTR(numa_set_membind_func_t,
+                                        libnuma_dlsym(handle, "numa_set_membind")));
+    set_numa_get_membind(CAST_TO_FN_PTR(numa_get_membind_func_t,
+                                        libnuma_dlsym(handle, "numa_get_membind")));
     if (numa_available() != -1) {
       set_numa_all_nodes((unsigned long*)libnuma_dlsym(handle, "numa_all_nodes"));
 ...
@@ -3054,6 +3060,8 @@
 os::Linux::numa_set_bind_policy_func_t os::Linux::_numa_set_bind_policy;
 os::Linux::numa_bitmask_isbitset_func_t os::Linux::_numa_bitmask_isbitset;
 os::Linux::numa_distance_func_t os::Linux::_numa_distance;
+os::Linux::numa_set_membind_func_t os::Linux::_numa_set_membind;
+os::Linux::numa_get_membind_func_t os::Linux::_numa_get_membind;
 unsigned long* os::Linux::_numa_all_nodes;
 struct bitmask* os::Linux::_numa_all_nodes_ptr;
 struct bitmask* os::Linux::_numa_nodes_ptr;
@@ -4962,8 +4970,9 @@
     if (!Linux::libnuma_init()) {
       UseNUMA = false;
     } else {
 ...
+      // ... NUMA.
+      UseNUMA = false;
     }
   }
diff --git a/src/hotspot/os/linux/os_linux.hpp b/src/hotspot/os/linux/os_linux.hpp
--- a/src/hotspot/os/linux/os_linux.hpp
+++ b/src/hotspot/os/linux/os_linux.hpp
@@ -228,6 +228,8 @@
 typedef int (*numa_tonode_memory_func_t)(void *start, size_t size, int node);
 typedef void (*numa_interleave_memory_func_t)(void *start, size_t size, unsigned long *nodemask);
 typedef void (*numa_interleave_memory_v2_func_t)(void *start, size_t size, struct bitmask *mask);
+typedef void (*numa_set_membind_func_t)(struct bitmask *mask);
+typedef struct bitmask* (*numa_get_membind_func_t)(void);
 ...
@@ -244,6 +246,8 @@
 static numa_set_bind_policy_func_t _numa_set_bind_policy;
 static numa_bitmask_isbitset_func_t _numa_bitmask_isbitset;
 static numa_distance_func_t _numa_distance;
+static numa_set_membind_func_t _numa_set_membind;
+static numa_get_membind_func_t _numa_get_membind;
 ...
 #endif // OS_LINUX_VM_OS_LINUX_HPP
diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp
--- a/src/hotspot/share/runtime/os.hpp
+++ b/src/hotspot/share/runtime/os.hpp
@@ -81,6 +81,10 @@
   CriticalPriority = 11 // Critical thread priority
 };
+extern "C" struct bitmask {
 ...
=============================================================================

Thanks, Swati



More information about the hotspot-dev mailing list