(RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads (original) (raw)
Chris Plummer chris.plummer at oracle.com
Mon Mar 20 23:29:13 UTC 2017
- Previous message: (RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads
- Next message: (RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 3/17/17 11:37 PM, Chris Plummer wrote:
On 3/17/17 8:17 PM, Chris Plummer wrote:
On 3/17/17 7:01 PM, David Holmes wrote:
On 18/03/2017 9:11 AM, Chris Plummer wrote:
Looks like this will need some more work since the added asserts are triggering on mac os x (which is the only place we'd currently expect them to assert).
The problem is that the code I found that rounds up to the page size is only applied to java threads created by the VM for which the java user specified no stack size. The VM and Compiler thread sizes are not rounded. The failure I saw was with runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java when is specified -XX:CompilerThreadStackSize=9007199254740991. I hit the assert with an EINVAL. The size is not aligned, but it could also be complaining because it is too big. I haven't tried aligning it yet to see. On Linux we do the following: stacksize = alignsizeup(stacksize + os::Linux::defaultguardsize(thrtype), vmpagesize()); We don't do this on BSD. I think the same on BSD would solve this problem. I'm not sure about adding the guard size. I'll need to see if mac os x has the same pthread bug as linux does. At this stage I would only deal with alignment issues. IIRC the guard issue only affected Linux. Yes, that's what I eventually concluded. I put the fix in os::Posix::getinitialstacksize() in osposix.cpp, and only did the page aligning, not add the guard page. That way all Posix ports are fixed in one place. It seems to be working. BTW, did you know java users can specify the size of the a new thread's stack: Yes I mentioned that in another reply - wondering whether we suitably check and aligned such requests. No we don't. Below I mentioned I was able to trigger the assert with a 257k stack size. I guess I wasn't clear that I did that from Java. I have a new test to add that will be testing this, plus the 9007199254740991 stack size (which fails to create the thread with an OOME, but that's acceptable). The fix I mention above in os::Posix::getinitialstacksize() takes care of this issue also. Rounding up triggers a new assert, this time on 32-bit x86 and arm. I should have clarified it's 9007199254740991 "K", which is 9223372036854774784. Unfortunately on 32bit systems that is asserting with EINVAL. I think we need to do a better job of dealing with 32-bit sizet values: jlong javalangThread::stackSize(oop javathread) { if (stackSizeoffset > 0) { return javathread->longfield(stackSizeoffset); } else { return 0; } } jlong size = javalangThread::stackSize(JNIHandles::resolvenonnull(jthread)); // Allocate the C++ Thread structure and create the native thread. The // stack size retrieved from java is signed, but the constructor takes // sizet (an unsigned type), so avoid passing negative values which would // result in really large stacks. sizet sz = size > 0 ? (sizet) size : 0; nativethread = new JavaThread(&threadentry, sz); 9223372036854774784 is 0x7ffffffffffffc00 (close to 64 bit MAXINT), which is 0xfffffc00 when cast to a sizet on a 32-bit system (close to 32-bit MAXUINT). Round it up to the 4k page size and you get 0, which I guess pthreadattrsetstacksize() doesn't like. So I think more processing of the size is needed here. Maybe in os::createthread() we should check for 0 after rounding up, and subtract the os page size if it is 0. However, I think we should also avoid truncating on 32-bit to what is basically some random number. Maybe if "size" (a jlong) is greater than UINTMAX, then just set "sz" (a sizet) it to UINTMAX. Ok, I think I have this all worked out now. I've added fixes for unaligned stack sizes, 32-bit truncating of stack size, and the "aligning up to 0" problem. I also added a test. Here's the latest webrev:
http://cr.openjdk.java.net/~cjplummer/8176768/webrev.02/webrev.hotspot
Here's what's changed since webrev.01:
os_posix.cpp: In os::Posix::get_initial_stack_size(), first round up the stack size to be paged aligned. This fixes issues on Mac OS X (other platforms seem to be immune to this). Then check if the size is zero after rounding up to the page size. Subtract the page size in this case to produce the maximum stack size allowed. Surprisingly I got no complaint from gcc for subtracting from an unsigned value that is known to be 0.
os_linux.cpp: In os::create_thread(), I also check here to make sure the size is not 0 after adding the guard page and aligning up, and subtract the os page size if it is 0.
jvm.c: In JVM_StartThread(), on 32-bit platforms if the size is greater than UINT_MAX, then I set the size to UINT_MAX. Note it will later be rounded up to 0 in os::Posix::get_initial_stack_size(), which will result in subtracting the os page size to get the actual maximum allowed stack size.
TooSmallStackSize.java: added test case for unaligned stack sizes.
TestThreadStackSizes.java: New test. Creates new threads with every size up to 320k in 1k increments. Then creates threads with a few other sizes that can be problematic.
thanks,
Chris
Chris
public Thread(ThreadGroup group, Runnable target, String name, long stackSize) { init(group, target, name, stackSize); } Fortunately we already force the stackSize to be at least javathreadminstackallowed. However, we don't do any OS page rounding on Mac OS X as noted above, and I was able to trigger the assert by creating a thread with size 257k. Note this means that OSX stack logic is broken because it will currently be silently failing due to EINVAL! Yes, that is correct. Chris I'll get another webrev out once I've made the needed fixes. I also have a new test I'd like to add. Ok. Thanks, David Chris On 3/16/17 9:27 PM, Chris Plummer wrote: Ok, time for a new webrev:
http://cr.openjdk.java.net/~cjplummer/8176768/webrev.01/webrev.hotspot
The only thing that has changed since the first webrev are the asserts added to oslinux.cpp and osbsd.cpp. And to summarize what we discuss already: - The assert should never happen due to the stack size being too small since it will be at least PTHREADSTACKMIN. - The assert should never happen due to an unaligned stack size because we always align it to the page size. - Any assert would therefore be a VM bug and not due to user error. - No fixing the java launcher. If the user specifies a stack that is too small, hotspot will already detect this. If the user specifies a stack size that is large enough but not page aligned, then we just ignore any error (if the platform doth protest) and the user gets a main thread with a stack size set to whatever the OS default is. I still need to retest (I only ran TooSmallStackSize.java), but figure getting agreement on the changes first would be best before I bog down our testing resources. thanks, Chris On 3/15/17 10:03 PM, Chris Plummer wrote: Hello, Please review the following: https://bugs.openjdk.java.net/browse/JDK-8176768 http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot
While working on 8175342 I noticed our stack size on xgene was 8mb even though I was specifying -Xss72k. It turns out the following code was failing: pthreadattrsetstacksize(&attr, stacksize); Although we computed a minimum stack size of 72k, so -Xss72k should be fine, pthreads on this platform requires the stack be at least 128k, so it failed the pthreadattrsetstacksize() call. The end result is pthreadattrsetstacksize() had no impact on the thread's stack size, and we ended up with the platform default of 8mb. The fix is to round up the following variables to PTHREADSTACKMIN after computing their new values: javathreadminstackallowed compilerthreadminstackallowed vminternalthreadminstackallowed For solaris, there was an issue using PTHREADSTACKMIN. You need to #define POSIXCSOURCE >= 199506L in order to get PTHREADSTACKMIN #defined, and this needs to be done before including OS header files. I noticed that on solaris we were using thrminstack() elsewhere instead of PTHREADSTACKMIN, so I decided to do the same with this fix. Either way is ugly (the #define or using thrminstack()). And speaking of the existing use of thrminstack(), I deleted it. It was being applied before any adjustments to the stack sizes had been made (rounding and adding red, yellow, and shadow zones). This mean the stack ended up being larger than necessary. With the above fix in place, we are now applying thrminstack() after recomputing the minimum stack sizes. If for any reason one of those stack sizes is now too small, the correct fix is to adjust the initial stack sizes, not apply thrminstack() to the initial stack sizes. However, it looks like no adjustment is needed. I did something close to our nightly testing on all affect platforms, and no new problems turned up. thanks, Chris
- Previous message: (RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads
- Next message: (RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]