RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
Lindenmaier, Goetz goetz.lindenmaier at sap.com
Wed Oct 10 15:01:44 UTC 2018
Hi David,
I implemented your little experiment and did 4 runs with my fix. The relevant output is here:
http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/01/with_my_fix.txt
Your code completes in one loop. From my output you can see that the CPU time is increasing a little, but after 3-4 iterations the thread goes away.
I also did 4 runs without my fix:
http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/01/without_my_fix.txt
I got 3 failures and one pass. Here too, your code completes in one loop.
Best regards, Goetz.
-----Original Message-----
From: David Holmes <david.holmes at oracle.com>
Sent: Wednesday, 10 October 2018 14:32
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
Hi Goetz,

On 10/10/2018 8:25 PM, Lindenmaier, Goetz wrote:
> Hi David,
>
> This failure is very well reproducible, but only on linuxppc64 and linuxppc64le.

That doesn't really make sense to me. I would not expect the process/thread lifecycle management code to be different based on the CPU involved. This should be a simple kernel + NPTL/libc issue.

> I implemented this fix in July, just missed the RDP, and the patch is used
> in our nightly builds since then. Since that date I don't see a single
> failure. We run these nightly tests with the fastdebug build, though.
> But linuxx86_64, linuxs390x don't show the issue, nor all the other
> platforms. As there is no special high load, and because it's that
> well reproducible, I don't think I read the information of a thread of another
> process with the same thread id.
> With the output I implemented in the test, I see that the cpu time keeps
> increasing a bit, then it's stable for a few iterations, and then -1.

That can also be explained by a thread-id being recycled and then the new thread also terminating. Granted the timing and reproducibility makes that unlikely.

This is quite bizarre and I don't like bizarre. :)

Are you able to apply this patch to the test and run some tests on ppc?

     if ((res = pthread_join(thread, NULL)) != 0) {
       fprintf(stderr, "TEST ERROR: pthread_join failed: %s (%d)\n", strerror(res), res);
       exit(1);
     }
+    while (pthread_kill(thread, 0) == 0) {
+      res++;
+    }
+    printf("Native thread was gone after %d iterations\n", res);
     return nativeThread;
   }

Once pthread_kill gives ESRCH then so should pthread_getcpuclockid(). At least until the thread-id is recycled.

Thanks,
David

> Best regards,
> Goetz.
>
>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Wednesday, 10 October 2018 01:22
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
>>
>> Hi Goetz,
>>
>> There is already an open bug for this issue - JDK-8208159 - but it has
>> only reproduced in a stress environment where we think thread-id's are
>> being recycled (which means waiting longer won't help). This should be
>> OS not CPU specific so I'm very interested to know in what circumstances
>> you see this failure.
>>
>> I created an instrumented version of the test that did a pthread_kill on
>> the target to check for ESRCH - which it got - yet we still see failures
>> in those stress environments.
>>
>> David
>>
>> On 10/10/2018 1:10 AM, Lindenmaier, Goetz wrote:
>>> Hi,
>>>
>>> On ppc, one still sees increasing thread cpu times after a thread has joined.
>>> This makes TestTerminatedThread fail.
>>>
>>> This change gives the check a few seconds to wait until the thread disappears.
>>> Please review.
>>> http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/01/test/hotspot/jtreg/runtime/jni/terminatedThread/TestTerminatedThread.java.udiff.html
>>>
>>> Best regards,
>>> Goetz.
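
The change under review is described in the thread as giving the check "a few seconds to wait until the thread disappears". A minimal C sketch of that bounded-retry idea might look like the following. It is not the actual patch (the webrev modifies the Java test), and the names wait_until, probe_fn and gone_after are hypothetical. The stand-in probe only counts calls; it deliberately does not call pthread functions on an already-joined thread id, because POSIX does not define what happens in that case.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical probe type: returns true once the awaited condition holds,
 * e.g. "the terminated thread no longer reports a CPU time". */
typedef bool (*probe_fn)(void *arg);

/* Calls probe(arg) up to max_attempts times, sleeping interval_ms between
 * failed attempts. Returns the 1-based attempt on which the probe first
 * succeeded, or -1 if the retry budget ran out. */
static int wait_until(probe_fn probe, void *arg, int max_attempts, long interval_ms) {
    assert(probe != NULL && max_attempts > 0 && interval_ms >= 0);
    struct timespec pause;
    pause.tv_sec = interval_ms / 1000;
    pause.tv_nsec = (interval_ms % 1000) * 1000000L;
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (probe(arg)) {
            return attempt;
        }
        nanosleep(&pause, NULL);
    }
    return -1;
}

/* Stand-in probe (assumption, for illustration only): reports success once
 * the counter it is handed has been decremented to zero. */
static bool gone_after(void *arg) {
    int *remaining = arg;
    *remaining -= 1;
    return *remaining <= 0;
}
```

With a counter of 4, wait_until(gone_after, &counter, 10, 10) returns on the fourth attempt, which mirrors the "after 3-4 iterations the thread goes away" pattern reported above. As David notes, a bounded wait like this does not help if the failure comes from a recycled thread-id rather than from slow teardown.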