RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
Lindenmaier, Goetz goetz.lindenmaier at sap.com
Fri Oct 12 06:28:49 UTC 2018
Hi Martin,
thanks for the review!
Best regards, Goetz.
-----Original Message-----
From: Doerr, Martin
Sent: Thursday, 11 October 2018 16:37
To: David Holmes <david.holmes at oracle.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately

Hi Götz,

thanks for the fix. Looks good to me, too.
It's a little unfortunate that we can't verify that the bean eventually returns -1, but ok. The test still seems to fulfill its purpose, so I'm fine with it.

Best regards,
Martin

-----Original Message-----
From: hotspot-runtime-dev <hotspot-runtime-dev-bounces at openjdk.java.net> On Behalf Of David Holmes
Sent: Thursday, 11 October 2018 14:34
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately

On 11/10/2018 10:12 PM, Lindenmaier, Goetz wrote:
>> Something like that yes. :)
> Can I consider this a review ? :))

Yes :)

Thanks,
David

> Best regards,
> Goetz.
>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Thursday, 11 October 2018 13:53
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
>>
>> On 11/10/2018 9:23 PM, Lindenmaier, Goetz wrote:
>>> Hi,
>>>
>>>> I may just have to kill off this part of the test
>>> You mean we should skip the tests for -1? Like this:
>>> http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/02/
>>> ?
>>
>> Something like that yes. :) The main thing this test is doing is
>> ensuring we don't crash when we encounter these
>> terminated-but-still-attached threads.
>>
>> Thanks,
>> David
>>
>>> Best regards,
>>> Goetz.
>>>
>>>
>>>> -----Original Message-----
>>>> From: David Holmes <david.holmes at oracle.com>
>>>> Sent: Thursday, 11 October 2018 08:03
>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
>>>> Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
>>>>
>>>> Hi Goetz,
>>>>
>>>> On 11/10/2018 1:01 AM, Lindenmaier, Goetz wrote:
>>>>> Hi David,
>>>>>
>>>>> I implemented your little experiment, and did 4 runs with my fix.
>>>>> I copied you the relevant output here:
>>>>> http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/01/withmyfix.txt
>>>>>
>>>>> Your code completes in one loop.
>>>>> From my output you can see that the CPU time is increasing a little, but
>>>>> after 3-4 iterations the thread goes away.
>>>>>
>>>>> I also did 4 runs without my fix:
>>>>> http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/01/withoutmyfix.txt
>>>>> I got 3 failures, one pass.
>>>>> Also here, your code completes in one loop.
>>>>
>>>> Many thanks for doing that. This is so perplexing. While adding the
>>>> extra loops as per your fix may have solved your problem, it will make
>>>> the recycled thread-id problem that we have seen in stress testing even
>>>> more likely.
>>>>
>>>> I may just have to kill off this part of the test. It's wasting too many
>>>> cycles just to try and check we are graceful when encountering a
>>>> terminated unattached thread.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Best regards,
>>>>> Goetz.
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>> Sent: Wednesday, 10 October 2018 14:32
>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
>>>>>> Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
>>>>>>
>>>>>> Hi Goetz,
>>>>>>
>>>>>> On 10/10/2018 8:25 PM, Lindenmaier, Goetz wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> This failure is very well reproducible, but only on linuxppc64 and
>>>>>>> linuxppc64le.
>>>>>>
>>>>>> That doesn't really make sense to me. I would not expect the
>>>>>> process/thread lifecycle management code to be different based on the
>>>>>> CPU involved. This should be a simple kernel + NPTL/libc issue.
>>>>>>
>>>>>>> I implemented this fix in July, just missed the RDP, and the patch is used
>>>>>>> in our nightly builds since then. Since that date I don't see a single
>>>>>>> failure. We run these nightly tests with the fastdebug build, though.
>>>>>>> But linuxx86_64, linuxs390x don't show the issue, nor all the other
>>>>>>> platforms. As there is no special high load, and because it's that
>>>>>>> well reproducible, I don't think I read the information of a thread of
>>>>>>> another process with the same thread id.
>>>>>>> With the output I implemented in the test, I see that the cpu time keeps
>>>>>>> increasing a bit, then it's stable for a few iterations, and then -1.
>>>>>>
>>>>>> That can also be explained by a thread-id being recycled and then the
>>>>>> new thread also terminating. Granted the timing and reproducibility
>>>>>> makes that unlikely.
>>>>>>
>>>>>> This is quite bizarre and I don't like bizarre. :)
>>>>>>
>>>>>> Are you able to apply this patch to the test and run some tests on ppc?
>>>>>>
>>>>>>     if ((res = pthread_join(thread, NULL)) != 0) {
>>>>>>       fprintf(stderr, "TEST ERROR: pthread_join failed: %s (%d)\n",
>>>>>>               strerror(res), res);
>>>>>>       exit(1);
>>>>>>     }
>>>>>>
>>>>>> +   while (pthread_kill(thread, 0) == 0) {
>>>>>> +     res++;
>>>>>> +   }
>>>>>> +   printf("Native thread was gone after %d iterations\n", res);
>>>>>>     return nativeThread;
>>>>>>   }
>>>>>>
>>>>>> Once pthread_kill gives ESRCH then so should pthread_getcpuclockid().
>>>>>> At least until the thread-id is recycled.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Best regards,
>>>>>>> Goetz.
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>> Sent: Wednesday, 10 October 2018 01:22
>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-dev at openjdk.java.net
>>>>>>>> Subject: Re: RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
>>>>>>>>
>>>>>>>> Hi Goetz,
>>>>>>>>
>>>>>>>> There is already an open bug for this issue - JDK-8208159 - but it has
>>>>>>>> only reproduced in a stress environment where we think thread-id's are
>>>>>>>> being recycled (which means waiting longer won't help). This should be
>>>>>>>> OS not CPU specific so I'm very interested to know in what circumstances
>>>>>>>> you see this failure.
>>>>>>>>
>>>>>>>> I created an instrumented version of the test that did a pthread_kill on
>>>>>>>> the target to check for ESRCH - which it got - yet we still see failures
>>>>>>>> in those stress environments.
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> On 10/10/2018 1:10 AM, Lindenmaier, Goetz wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On ppc, one still sees increasing thread cpu times after a thread has joined.
>>>>>>>>> This makes TestTerminatedThread fail.
>>>>>>>>>
>>>>>>>>> This change gives the check a few seconds to wait until the thread disappears.
>>>>>>>>> Please review.
>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr18/8211931-terminatedThrd/01/test/hotspot/jtreg/runtime/jni/terminatedThread/TestTerminatedThread.java.udiff.html
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Goetz.
>>>>>>>>>
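
For readers following the thread, the idea behind the first webrev ("give the check a few seconds to wait until the thread disappears") can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the webrev: the class name `TerminatedThreadCpuTimePoll`, the helper `awaitTerminatedCpuTime`, the 10 ms poll interval and the 5 second timeout are all hypothetical, and the sketch uses a plain Java thread, whereas the real test works with a native thread attached through JNI. It relies only on the documented behaviour of `ThreadMXBean.getThreadCpuTime(long)`, which returns -1 for a thread that is not alive.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class TerminatedThreadCpuTimePoll {

    // Hypothetical helper: polls the bean until it reports -1 for the given
    // thread id or until the timeout elapses. Returns the last value read,
    // so the caller can tell whether the thread ever "disappeared".
    static long awaitTerminatedCpuTime(ThreadMXBean bean, long threadId,
                                       long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        long cpuTime = bean.getThreadCpuTime(threadId);
        while (cpuTime != -1 && System.nanoTime() - deadline < 0) {
            Thread.sleep(10); // assumed poll interval, not from the thread
            cpuTime = bean.getThreadCpuTime(threadId);
        }
        return cpuTime;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isThreadCpuTimeSupported()) {
            System.out.println("Thread CPU time not supported, nothing to check");
            return;
        }

        // A short-lived worker that burns a little CPU and then exits.
        Thread worker = new Thread(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                System.out.println("unreachable");
            }
        });
        worker.start();
        worker.join();

        // Bounded wait instead of a single immediate check.
        long result = awaitTerminatedCpuTime(bean, worker.getId(), 5_000);
        System.out.println("CPU time reported after join: " + result);
    }
}
```

As David points out in the thread, a longer wait of this kind also widens the window in which a native thread id can be recycled, which is one reason the discussion ends with dropping the -1 check rather than extending the wait.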