Review Request: UseNUMAInterleaving

Deneau, Tom tom.deneau at amd.com
Mon May 16 10:54:11 PDT 2011

Previous message: Request for reviews (S): 7044725: -XX:-UnrollLimitCheck -Xcomp : Exception: String index out of range: 29488
Next message: Review Request: UseNUMAInterleaving

Please review this patch, which adds a new flag called UseNUMAInterleaving. This flag provides a subset of the functionality provided by UseNUMA, and its main purpose is to provide that subset on OSes like Windows that do not support the full UseNUMA functionality. In UseNUMA terminology, UseNUMAInterleaving makes all memory "numa_global", which is implemented as interleaved.

The situations where this shows the biggest benefits would be:

- Windows platforms with multiple NUMA nodes (e.g., 4)
- The JVM process is run across all the nodes (not affinitized to one node).
- A workload that uses the majority of the cores in the machine, so that the heap is being accessed from many cores, including remote ones.
- Enough memory per node and a heap size such that the default heap placement policy on Windows would end up with the heap (or nursery) placed on one node.

jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our measurements, we have seen some cases where the performance with UseNUMAInterleaving was 2.7x the performance without it. There were gains of varying sizes across all systems.

As currently implemented, this flag is ignored on Linux and Solaris, since they already support the full UseNUMA flag.

The webrev is at http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.01/

Summary of changes:

Other than adding the new UseNUMAInterleaving global flag, all of the changes are in src/os/windows/vm/os_windows.cpp
Some static routines were added to set things up at init time. These:
- check that the required APIs (VirtualAllocExNuma, GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in the OS
- build the list of NUMA nodes on which this process has affinity
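
To illustrate the node-list step, here is a minimal portable sketch, not the code from the webrev. It assumes each node's processor mask has already been obtained (on Windows, GetNumaNodeProcessorMask would supply it for each node up to GetNumaHighestNodeNumber) and that the process affinity mask is known. The function name build_numa_node_list and the use of plain 64-bit masks are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the patch): return the NUMA nodes that share
// at least one processor with the process affinity mask.
// node_masks[i] stands in for the processor mask reported for node i.
std::vector<int> build_numa_node_list(const std::vector<std::uint64_t>& node_masks,
                                      std::uint64_t process_affinity) {
  std::vector<int> nodes;
  for (std::size_t i = 0; i < node_masks.size(); ++i) {
    // A node is usable if any of its processors is in the affinity mask.
    if ((node_masks[i] & process_affinity) != 0) {
      nodes.push_back(static_cast<int>(i));
    }
  }
  return nodes;
}
```

A process affinitized to a single node would get a one-entry list here, in which case interleaving has nothing to spread across.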
Changes to os::reserve_memory
- There was already a routine that reserved pages one page at a time (used for Individual Large Page Allocation on WS2003). This was abstracted into a separate routine called allocate_pages_individually. It gets called both for the Individual Large Page Allocation case mentioned above and for UseNUMAInterleaving (for both small and large pages).
- When used for NUMA interleaving, this just goes through the NUMA node list in round-robin fashion, using a different node for each chunk (with 4K pages the chunk is the minimum allocation granularity, 64K; with 2M pages it is one page).
- Whether we do just a reserve or a combined reserve/commit is determined by the caller of allocate_pages_individually.
  - When used with large pages, we do a reserve and commit at the same time, which is the way it always worked and the way it has to work on Windows.
  - For small pages, only the reserve is done; the commit will come later (which is the way it worked for non-interleaved).
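
The round-robin assignment described above can be sketched as follows. This is a portable illustration under stated assumptions, not the patch itself: it only plans which node each chunk would go to. The names Chunk and plan_interleaved_chunks are hypothetical; in the real code each planned chunk would presumably become one VirtualAllocExNuma call with the chosen node as the preferred node.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one planned allocation (not from the patch).
struct Chunk {
  std::size_t offset;  // byte offset from the start of the requested range
  std::size_t size;    // bytes in this chunk
  int node;            // NUMA node chosen for this chunk
};

// Split `bytes` into pieces of `chunk_size` and assign nodes round-robin.
// chunk_size would be 64K for small pages or the large-page size otherwise.
std::vector<Chunk> plan_interleaved_chunks(std::size_t bytes,
                                           std::size_t chunk_size,
                                           const std::vector<int>& nodes) {
  std::vector<Chunk> plan;
  if (chunk_size == 0 || nodes.empty()) return plan;
  std::size_t next = 0;  // index into `nodes` for the next chunk
  for (std::size_t off = 0; off < bytes; off += chunk_size) {
    const std::size_t remaining = bytes - off;
    const std::size_t size = remaining < chunk_size ? remaining : chunk_size;
    plan.push_back({off, size, nodes[next]});
    next = (next + 1) % nodes.size();
  }
  return plan;
}
```

With four nodes and 64K chunks, a 320K request would be planned as nodes 0, 1, 2, 3, 0.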
os::commit_memory changes
- If UseNUMAInterleaving is true, os::commit_memory has to check whether it is being asked to commit memory that might have come from multiple reserve allocations. If so, the commits must also be broken up. We don't keep any data structure to track this; we just use VirtualQuery, which queries the properties of a VA range and can tell us how much came from one VirtualAlloc call.
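
To show the commit-splitting idea in isolation, here is a portable sketch under stated assumptions. It is not the patch: the real code asks VirtualQuery about the address range, whereas this sketch takes an explicit list of reservations as a stand-in for those answers. Range and split_commit are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical address range (not from the patch).
struct Range {
  std::uintptr_t base;
  std::size_t size;
};

// Break the commit request [addr, addr + bytes) into pieces that each lie
// within a single reservation. `reservations` is assumed non-overlapping and
// sorted by base, so the pieces come out in address order.
std::vector<Range> split_commit(const std::vector<Range>& reservations,
                                std::uintptr_t addr, std::size_t bytes) {
  std::vector<Range> pieces;
  const std::uintptr_t end = addr + bytes;
  for (const Range& r : reservations) {
    const std::uintptr_t r_end = r.base + r.size;
    // Intersect the request with this reservation.
    const std::uintptr_t lo = addr > r.base ? addr : r.base;
    const std::uintptr_t hi = end < r_end ? end : r_end;
    if (lo < hi) {
      pieces.push_back({lo, static_cast<std::size_t>(hi - lo)});
    }
  }
  return pieces;
}
```

Each returned piece would then be committed with its own call, since a single commit cannot span memory that came from separate reserve allocations.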

I do not have a bug id for this.

-- Tom Deneau, AMD

More information about the hotspot-compiler-dev mailing list