RFR (XS) 8047212: fix race between ObjectMonitor alloc and verification code (original) (raw)

David Holmes david.holmes at oracle.com
Wed Oct 21 03:33:49 UTC 2015


On 21/10/2015 1:30 PM, Daniel D. Daugherty wrote:

On 10/20/15, 9:24 PM, David Holmes wrote:

On 21/10/2015 12:51 PM, Daniel D. Daugherty wrote:

Thanks for the quick review!

On 10/20/15, 8:34 PM, David Holmes wrote: On 21/10/2015 10:43 AM, serguei.spitsyn at oracle.com wrote: Dan,

Nice finding! The fixes look good to me. The changes in the oopsdo() and deflateidlemonitors() are not necessary as these functions are called in VMOperation's. But some unification with OrderAccess::loadptracquire() does not harm either. I was thinking on the same lines: that reading the variable at a safepoint should already be safe, but that consistently using loadacquireptr is harmless and stylistically a 'Good Thing". Good. We're on the same page here... The other thought I had was perhaps making the verification routines (which are non-product) use the lock as well. :) If the verification routine had used the lock, then we wouldn't have spotted this unsafe use of the lock-free algorithm in this context. The lock-free calls to ObjectSynchronizer::monitorsiterate() would simply miss everything other than the new ObjectMonitor block on occasion; not an easy thing to diagnose unless you are looking for something specific on the list that you know should be there... AFAICS monitorsiterate() is either called at a safepoint or else with the lock held. Exactly which call to monitorsiterate exposed this problem? I could swear I ran across a monitorsiterate() call from JVM/TI that didn't go to a safepoint because the target thread was suspended... I have a vague memory that JVM/TI uses a safepoint-op when the target thread is not suspended and goes direct when the target thread is suspended...

My bad - you are right, it skips the safepoint when the target is suspended.

David

Dan

Aside: did you check that all the other global monitor variables are accessed correctly - ie either always with the lock held or using OrderAccess operations if lock-free? Nothing more than a cursory look. I have a deflateidlemonitors bug on my plate also so I'll be looking at this code even more... Ok. Thanks, David I have still have the query about the vmStructs changes but otherwise thumbs up. My reply to that query crossed in the ether with this review... Dan

Thanks, David Thanks, Serguei

On 10/20/15 14:15, Daniel D. Daugherty wrote: Greetings, I've updated the fix based on feedback from Carsten V and David H. Webrev URL: http://cr.openjdk.java.net/~dcubed/8047212-webrev/1-jdk9-hs-rt/ Changes relative to round 0: - only src/share/vm/runtime/synchronizer.cpp has changed - reads of gBlockList now use OrderAccess::loadptracquire() code style cleanups: - only cleaned up the functions that I touched to make the OrderAccess::loadptracquire() changes - changed implied booleans into real boolean expressions - moved some locals to narrower context - added/removed some blank lines - made casts consistent with the majority style in this file I'm repeating all of the same testing that I did for round 0. The round 1 bits have not yet made it through JPRT-west, but the jobs are mostly done. Thanks, in advance, for any comments, questions or suggestions. Dan

On 10/19/15, 9:02 PM, Daniel D. Daugherty wrote: Greetings, I have a fix for a long standing race between the lock-free ObjectMonitor verification code and the normal (locked) ObjectMonitor block allocation code path. For this fix, I would like at least a Runtime team reviewer and a Serviceability team reviewer. Thanks! JDK-8047212 runtime/ParallelClassLoading/bootstrap/random/inner-complex assert(ObjectSynchronizer::verifyobjmonisinpool(inf)) failed: monitor is invalid https://bugs.openjdk.java.net/browse/JDK-8047212 Webrev URL: http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/ Testing: Aurora Adhoc RT-SVC nightly batch 4 inner-complex fastdebug parallel runs for 4+ days and 600K iterations without seeing this failure; the experiment is still running; final results to be reported at the end of the review cycle JPRT -testset hotspot This fix: - makes ObjectMonitor::gBlockList volatile - uses "OrderAccess::releasestoreptr(&gBlockList, temp)" to make sure the new block updates happen before gBlockList is changed to refer to the new block - add SA support for a "static pointer volatile" field like: static ObjectMonitor * volatile gBlockList; See the following link for a nice description of what "volatile" means in the different positions on a variable/parameter decl line: http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword Thanks, in advance, for any comments, questions or suggestions. Dan



More information about the hotspot-runtime-dev mailing list