RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
Doerr, Martin martin.doerr at sap.com
Thu Oct 11 14:37:06 UTC 2018
Hi Götz,
thanks for the fix. Looks good to me, too. It's a little unfortunate that we can't verify that the bean eventually returns -1, but ok. The test still seems to fulfill its purpose, so I'm fine with it.
Best regards, Martin
-----Original Message-----
From: hotspot-runtime-dev <hotspot-runtime-dev-bounces at openjdk.java.net> On Behalf Of David Holmes
Sent: Thursday, 11 October 2018 14:34
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately

On 11/10/2018 10:12 PM, Lindenmaier, Goetz wrote:
>> Something like that yes. :)
>
> Can I consider this a review? :))

Yes :)

Thanks,
David

> Best regards,
>   Goetz.
-----Original Message-----
From: David Holmes <david.holmes at oracle.com>
Sent: Thursday, 11 October 2018 13:53
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately

On 11/10/2018 9:23 PM, Lindenmaier, Goetz wrote:
> Hi,
>
>> I may just have to kill off this part of the test
>
> You mean we should skip the tests for -1? Like this:
> http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/02/ ?

Something like that yes. :)

The main thing this test is doing is ensuring we don't crash when we encounter these terminated-but-still-attached threads.

Thanks,
David

> Best regards,
>   Goetz.
-----Original Message-----
From: David Holmes <david.holmes at oracle.com>
Sent: Thursday, 11 October 2018 08:03
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately

Hi Goetz,

On 11/10/2018 1:01 AM, Lindenmaier, Goetz wrote:
> Hi David,
>
> I implemented your little experiment, and did 4 runs with my fix.
> I copied the relevant output for you here:
> http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/01/withmyfix.txt
> Your code completes in one loop. From my output you can see that the
> CPU time is increasing a little, but after 3-4 iterations the thread
> goes away.
>
> I also did 4 runs without my fix:
> http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/01/withoutmyfix.txt
> I got 3 failures, one pass. Also here, your code completes in one loop.

Many thanks for doing that. This is so perplexing.

While adding the extra loops as per your fix may have solved your problem, it will make the recycled thread-id problem that we have seen in stress testing even more likely.

I may just have to kill off this part of the test. It's wasting too many cycles just to try and check we are graceful when encountering a terminated unattached thread.

Thanks,
David

> Best regards,
>   Goetz.
-----Original Message-----
From: David Holmes <david.holmes at oracle.com>
Sent: Wednesday, 10 October 2018 14:32
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately

Hi Goetz,

On 10/10/2018 8:25 PM, Lindenmaier, Goetz wrote:
> Hi David,
>
> This failure is very well reproducible, but only on linuxppc64 and
> linuxppc64le.

That doesn't really make sense to me. I would not expect the process/thread lifecycle management code to be different based on the CPU involved. This should be a simple kernel + NPTL/libc issue.

> I implemented this fix in July, just missed the RDP, and the patch
> has been used in our nightly builds since then. Since that date I
> haven't seen a single failure. We run these nightly tests with the
> fastdebug build, though.
> But linuxx86_64 and linuxs390x don't show the issue, nor do any of the
> other platforms.
>
> As there is no special high load, and because it's that well
> reproducible, I don't think I read the information of a thread of
> another process with the same thread id.
>
> With the output I implemented in the test, I see that the cpu time
> keeps increasing a bit, then it's stable for a few iterations, and
> then -1.

That can also be explained by a thread-id being recycled and then the new thread also terminating. Granted the timing and reproducibility makes that unlikely.

This is quite bizarre and I don't like bizarre. :) Are you able to apply this patch to the test and run some tests on ppc?

    if ((res = pthread_join(thread, NULL)) != 0) {
      fprintf(stderr, "TEST ERROR: pthread_join failed: %s (%d)\n", strerror(res), res);
      exit(1);
    }
+   while (pthread_kill(thread, 0) == 0) {
+     res++;
+   }
+   printf("Native thread was gone after %d iterations\n", res);
    return nativeThread;
  }

Once pthread_kill gives ESRCH then so should pthread_getcpuclockid(). At least until the thread-id is recycled.

Thanks,
David

> Best regards,
>   Goetz.
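For readers unfamiliar with the call David mentions, here is a minimal sketch of how a per-thread CPU time can be read with pthread_getcpuclockid() and clock_gettime(). The helper name thread_cpu_time_ns and the choice to map any error to -1 are assumptions made for illustration (they mirror the -1 discussed in this thread) and are not taken from the JDK sources. The sketch should only be called with a thread ID that is still valid; as far as I know, POSIX does not define what happens for an ID whose thread has already been joined.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/*
 * Hypothetical helper: returns the CPU time consumed by `thread` in
 * nanoseconds, or -1 if the clock cannot be obtained or read
 * (for example when pthread_getcpuclockid() reports ESRCH).
 */
static long long thread_cpu_time_ns(pthread_t thread)
{
    clockid_t clock_id;
    struct timespec ts;

    /* Ask the threading library for the CPU-time clock of this thread. */
    if (pthread_getcpuclockid(thread, &clock_id) != 0) {
        return -1;
    }
    /* Read the clock; failure is reported the same way. */
    if (clock_gettime(clock_id, &ts) != 0) {
        return -1;
    }
    return (long long)ts.tv_sec * 1000000000LL + (long long)ts.tv_nsec;
}
```

Calling thread_cpu_time_ns(pthread_self()) is a safe way to try it out, since the calling thread is always alive.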
-----Original Message-----
From: David Holmes <david.holmes at oracle.com>
Sent: Wednesday, 10 October 2018 01:22
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately

Hi Goetz,

There is already an open bug for this issue - JDK-8208159 - but it has only reproduced in a stress environment where we think thread-ids are being recycled (which means waiting longer won't help).

This should be OS not CPU specific so I'm very interested to know in what circumstances you see this failure.

I created an instrumented version of the test that did a pthread_kill on the target to check for ESRCH - which it got - yet we still see failures in those stress environments.

David

On 10/10/2018 1:10 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> On ppc, one still sees increasing thread cpu times after a thread has
> joined. This makes TestTerminatedThread fail. This change gives the
> check a few seconds to wait until the thread disappears.
>
> Please review.
> http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/01/test/hotspot/jtreg/runtime/jni/terminatedThread/TestTerminatedThread.java.udiff.html
>
> Best regards,
>   Goetz.
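The "give the check a few seconds" idea from the original request can be sketched as a bounded polling loop. The actual change is in the Java test and is not reproduced here; the C sketch below only illustrates the pattern. The names wait_until, condition_fn and countdown_done are hypothetical, and the countdown predicate is a stand-in for "the bean now returns -1".

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the condition holds. */
typedef bool (*condition_fn)(void *ctx);

/*
 * Poll `cond` at most `max_attempts` times, sleeping `interval_ms`
 * milliseconds after each failed attempt. Returns the number of failed
 * attempts before the condition held, or -1 if it never held.
 */
static int wait_until(condition_fn cond, void *ctx,
                      int max_attempts, long interval_ms)
{
    struct timespec pause = {
        .tv_sec = interval_ms / 1000,
        .tv_nsec = (interval_ms % 1000) * 1000000L
    };
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (cond(ctx)) {
            return attempt;
        }
        nanosleep(&pause, NULL);
    }
    return -1;
}

/*
 * Stand-in predicate for illustration only: becomes true once the
 * counter behind `ctx` has been decremented to zero.
 */
static bool countdown_done(void *ctx)
{
    int *remaining = ctx;
    if (*remaining == 0) {
        return true;
    }
    (*remaining)--;
    return false;
}
```

Unlike an unbounded loop, this gives up after a fixed number of attempts. As David points out above, a longer wait also widens the window in which a thread-id can be recycled, so the bound is a trade-off rather than a fix.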