Large page use crashes the JVM on some Linux systems

B. Blaser bsrbnd at gmail.com
Wed Apr 25 12:00:41 UTC 2018


[further private conversation summary]

On 24 April 2018 at 21:15, B. Blaser <bsrbnd at gmail.com> wrote:

On 24 April 2018 at 11:47, Claes Redestad <claes.redestad at oracle.com> wrote:

Hi Bernard,

On 2018-04-24 11:27, B. Blaser wrote:

Hi Claes,

Thanks for your feedback, I'll try to improve the fix as suggested.

someone pointed out we already do a sanity check similar to the one you're proposing..

src/hotspot/os/linux/os_linux.cpp:

bool os::Linux::hugetlbfs_sanity_check(bool warn, size_t page_size) { [...] }

It seems it'll warn only if you explicitly use -XX:+UseHugeTLBFS. -XX:+UseLargePages on Linux first attempts to use UseHugeTLBFS, then falls back to -XX:+UseSHM.

... what errors do you see on your system when you run -version with -XX:+UseLargePages, -XX:+UseHugeTLBFS and -XX:+UseSHM respectively? Most systems aren't configured to use HugeTLBFS, so my guess is your system actually has an issue with UseSHM...

I'm aware of this sanity check. The problem is that on my system 'mmap()' always fails and then the JVM attempts to use SHM instead. I'll check my configuration more closely and reread the kernel VM doc:

https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt

but, in short, both 'mmap()' and SHM can access large pages (2 MB on my computer), but this has to be enabled (for SHM as well), which doesn't seem to be the case by default.

So, to answer your questions:

1) -XX:+UseLargePages and -XX:+UseHugeTLBFS have the same effect as -XX:+UseSHM because 'mmap()' nicely complains when trying to use huge TLB and then SHM is used instead.

2) Unfortunately, SHM doesn't complain (no problem when calling 'shmget' or 'shmat'), but the allocated memory isn't aligned to the large page size (2 MB), which crashes the JVM (SHM probably allocates memory using the default page size even when 2 MB pages are requested, which I have to verify).

In conclusion, the current JVM behavior of trying to use SHM if 'mmap()' fails seems to be brittle. I think we have to check whether large pages are supported/enabled when starting the JVM.

Probably checking '/proc/meminfo' - '/proc/filesystems' - '/proc/sys/vm/nr_hugepages' would be faster than calling 'mmap()'.

I'll read the kernel doc again, but I think calling 'mmap()' is a robust, if "slow", way to see whether large pages can be used, though I agree that it doesn't tell whether they are not enabled or not supported.

What do you think we should do?

Bernard

/Claes

Thanks,
Bernard


On 24 April 2018 at 21:39, Claes Redestad <claes.redestad at oracle.com> wrote:

The root issue here could very well be that the SHM sanity test is insufficient. Adding the same test as we already do for TLBFS seems like the wrong approach.

I'm not the most knowledgeable about SHM, though, in fact not knowledgeable at all, so let's try and get you subscribed to hotspot-dev and spark a discussion on the list.

/Claes

In concrete terms (on my system):

$ grep "hugetlbfs" /proc/filesystems
nodev   hugetlbfs

$ grep -e "HugePages_" -e "Hugepagesize" /proc/meminfo
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

Which means that huge pages are supported but not configured.

$ ./build/linux-x86_64-normal-server-release/jdk/bin/java -XX:+UseLargePages -version
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (g1PageBasedVirtualSpace.cpp:49), pid=2914, tid=2915
#  guarantee(is_aligned(rs.base(), page_size)) failed: Reserved space base 0x00007f5c20b10000 is not aligned to requested page size 2097152
#
# JRE version:  (11.0) (build )
# Java VM: OpenJDK 64-Bit Server VM (11-internal+0-adhoc.devel.jdk, mixed mode, aot, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: core.2914 (may not exist)
#
# An error report file with more information is saved as:
# /home/****/jdk/hs_err_pid2914.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

$ ./build/linux-x86_64-normal-server-release/jdk/bin/java -XX:+UseHugeTLBFS -version
OpenJDK 64-Bit Server VM warning: HugeTLBFS is not supported by the operating system.
openjdk version "11-internal" 2018-09-25
OpenJDK Runtime Environment (build 11-internal+0-adhoc.devel.jdk)
OpenJDK 64-Bit Server VM (build 11-internal+0-adhoc.devel.jdk, mixed mode)

$ ./build/linux-x86_64-normal-server-release/jdk/bin/java -XX:+UseSHM -version
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (g1PageBasedVirtualSpace.cpp:49), pid=2974, tid=2975
#  guarantee(is_aligned(rs.base(), page_size)) failed: Reserved space base 0x00007f8a06890000 is not aligned to requested page size 2097152
#
# JRE version:  (11.0) (build )
# Java VM: OpenJDK 64-Bit Server VM (11-internal+0-adhoc.devel.jdk, mixed mode, aot, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: core.2974 (may not exist)
#
# An error report file with more information is saved as:
# /home/****/jdk/hs_err_pid2974.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

So, I guess the least the JVM should do is to unconditionally disable large page use at startup if '/proc/meminfo' reports 'HugePages_Total: 0'.

But I'll investigate what can be done to improve the SHM sanity check too.
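To make the SHM idea a bit more concrete, here is a rough sketch of what such a probe could look like. It is not HotSpot code: 'ShmProbe', 'is_aligned' and 'probe_shm_large_page' are hypothetical names chosen for illustration, and it assumes a Linux system where <sys/shm.h> defines SHM_HUGETLB. It requests one segment of the large page size, attaches it, and reports whether the returned address is aligned to that size.

```cpp
#include <sys/ipc.h>
#include <sys/shm.h>
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch (not part of HotSpot).
enum class ShmProbe { Ok, NoSegment, NoAttach, Misaligned };

// True if p is a multiple of alignment; alignment must be a power of two.
bool is_aligned(const void* p, std::size_t alignment) {
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Tries to obtain one SHM segment backed by huge pages and checks the
// alignment of the attached address. Assumes Linux (SHM_HUGETLB).
ShmProbe probe_shm_large_page(std::size_t large_page_size) {
  int id = shmget(IPC_PRIVATE, large_page_size,
                  SHM_HUGETLB | IPC_CREAT | 0600);
  if (id == -1) {
    return ShmProbe::NoSegment;
  }
  void* addr = shmat(id, NULL, 0);
  // Mark the segment for removal now; it is destroyed after the last detach.
  shmctl(id, IPC_RMID, NULL);
  if (addr == reinterpret_cast<void*>(-1)) {
    return ShmProbe::NoAttach;
  }
  const bool aligned = is_aligned(addr, large_page_size);
  shmdt(addr);
  return aligned ? ShmProbe::Ok : ShmProbe::Misaligned;
}
```

For reference, the base address from the first crash above, 0x00007f5c20b10000, leaves a remainder of 0x110000 modulo 2097152 (0x200000), so 'is_aligned' returns false for it with a 2 MB page size while it is still aligned to 4096. Whether a probe like this would actually report Misaligned on the affected system is something I cannot confirm from the output above.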

Or maybe someone on hotspot-dev would have another idea?

Bernard


On 23 April 2018 at 11:18, Claes Redestad <claes.redestad at oracle.com> wrote:

[ /bcc amber-dev, /cc hotspot-dev ]

Hi,

unconditionally mapping and unmapping a large page on startup seems sub-optimal to me - could this be checked directly after the -XX:+UseLargePages flag has been parsed?

I'd also note that explicitly configured large pages are typically a limited resource: does this test distinguish between a failure due to the system not supporting the feature and a failure due to not having any free pages left? Printing a "UseLargePages is unsupported" message in the latter case would be misleading.

I wonder if checking something like /proc/meminfo for HugePages* is a more robust way to probe capabilities, and also whether this is more suited as a test harness feature, i.e., enhance jtreg and tag these tests so that they're ignored on systems that don't have any/enough huge pages.

Thanks!

/Claes
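As an illustration of the /proc/meminfo approach, the following is a minimal sketch, not HotSpot code, that separates "unsupported" from "not configured" and "no free pages left". All names ('HugePageState', 'meminfo_value', 'classify_huge_pages', 'read_file') are hypothetical. It assumes the HugePages_* lines are absent when the kernel has no huge page support, and it ignores reserved and surplus pages, so treat it as a starting point only.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical classification for this sketch (not part of HotSpot).
enum class HugePageState { Unsupported, NotConfigured, Exhausted, Available };

// Returns the integer following "<key>:" in /proc/meminfo-style text,
// or -1 if the key is missing or has no numeric value.
long meminfo_value(const std::string& text, const std::string& key) {
  const std::string prefix = key + ":";
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    if (line.compare(0, prefix.size(), prefix) == 0) {
      std::istringstream rest(line.substr(prefix.size()));
      long value = -1;
      return (rest >> value) ? value : -1;
    }
  }
  return -1;
}

// Classifies the explicit huge page pool from meminfo text.
HugePageState classify_huge_pages(const std::string& meminfo_text) {
  const long total = meminfo_value(meminfo_text, "HugePages_Total");
  const long free_pages = meminfo_value(meminfo_text, "HugePages_Free");
  // Assumption: missing HugePages_* lines mean no kernel support.
  if (total < 0 || free_pages < 0) return HugePageState::Unsupported;
  if (total == 0) return HugePageState::NotConfigured;
  if (free_pages == 0) return HugePageState::Exhausted;
  return HugePageState::Available;
}

// Reads a whole file into a string; returns "" if it cannot be opened.
std::string read_file(const char* path) {
  std::ifstream in(path);
  std::ostringstream out;
  out << in.rdbuf();
  return out.str();
}
```

A caller would use something like classify_huge_pages(read_file("/proc/meminfo")). The point of the separate states is that a warning can say "no huge pages configured" or "no free huge pages" instead of a blanket "unsupported".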

On 2018-04-22 23:18, B. Blaser wrote:

[ I've trouble subscribing to hotspot-dev, please forward if necessary. ]

Hi,

After a clean build, some hotspot tests related to large page use are failing on my 64-bit Linux system, for example:

gc/g1/TestLargePageUseForAuxMemory.java
[...]

Or simply:

$ ./build/linux-x86_64-normal-server-release/images/jdk/bin/java -XX:+UseLargePages -version

is crashing the JVM because the latter assumes that large pages are always supported on Linux, which appears to be wrong.

I suggest making sure that large pages are supported when parsing the arguments, as below.

Does this look reasonable (tier1 looks better now)?

Thanks,
Bernard

diff -r 8c85a1855e10 src/hotspot/share/runtime/arguments.cpp
--- a/src/hotspot/share/runtime/arguments.cpp    Fri Apr 13 11:14:49 2018 -0700
+++ b/src/hotspot/share/runtime/arguments.cpp    Sun Apr 22 20:29:21 2018 +0200
@@ -60,6 +60,7 @@
 #include "utilities/defaultStream.hpp"
 #include "utilities/macros.hpp"
 #include "utilities/stringUtils.hpp"
+#include "sys/mman.h"
 #if INCLUDE_JVMCI
 #include "jvmci/jvmciRuntime.hpp"
 #endif
@@ -4107,6 +4108,18 @@
   UNSUPPORTED_OPTION(UseLargePages);
 #endif

+#ifdef LINUX
+  void *p = mmap(NULL, os::large_page_size(), PROT_READ|PROT_WRITE,
+                 MAP_ANONYMOUS|MAP_PRIVATE|MAP_HUGETLB,
+                 -1, 0);
+  if (p != MAP_FAILED) {
+    munmap(p, os::large_page_size());
+  }
+  else {
+    UNSUPPORTED_OPTION(UseLargePages);
+  }
+#endif
+
   ArgumentsExt::report_unsupported_options();

 #ifndef PRODUCT
diff -r 8c85a1855e10 test/hotspot/jtreg/runtime/memory/LargePages/TestLargePagesFlags.java
--- a/test/hotspot/jtreg/runtime/memory/LargePages/TestLargePagesFlags.java    Fri Apr 13 11:14:49 2018 -0700
+++ b/test/hotspot/jtreg/runtime/memory/LargePages/TestLargePagesFlags.java    Sun Apr 22 20:29:21 2018 +0200
@@ -37,7 +37,7 @@
 public class TestLargePagesFlags {

   public static void main(String [] args) throws Exception {
-    if (!Platform.isLinux()) {
+    if (!Platform.isLinux() || !canUse(UseLargePages(true))) {
       System.out.println("Skipping. TestLargePagesFlags has only been implemented for Linux.");
       return;
     }


