RFR (XS) 8047212: fix race between ObjectMonitor alloc and verification code (original) (raw)

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Oct 22 13:40:56 UTC 2015


On 10/21/15, 5:42 PM, Coleen Phillimore wrote:

Dan, I looked at this code and the bug description. This looks good!

Thanks! And thanks for the review!

I like the symmetry of loadpointeracquire/storerelease and that all loads of gBlockList are loaded with this. I agree with you that the uses at safepoint should follow the same pattern.

Thanks go to Carsten V for reminding me about the need to do matching acquire ops when you use release.

I assume that once you get the gBlockList then traversing the elements do not need synchronization with the loads (of the next pointers).

Correct. New blocks of ObjectMonitors are always prefixed onto the beginning of the chain and we never release the blocks; they are Type Stable Memory (TSM).

I think the vmStructs changes are fine. You got it to compile and checked the SA code, that is all you need to do. ]

Thanks. Should be much easier the next time someone needs to add a static ptr volatile field...

Congratulations! for getting to the bottom of this troubling bug. This is a really good find!

Thanks! It's been a long hunt... All I needed was for the debug code to kick in during one failure... And that finally happened...

Dan

Coleen On 10/20/15 5:15 PM, Daniel D. Daugherty wrote: Greetings,

I've updated the fix based on feedback from Carsten V and David H. Webrev URL: http://cr.openjdk.java.net/~dcubed/8047212-webrev/1-jdk9-hs-rt/ Changes relative to round 0: - only src/share/vm/runtime/synchronizer.cpp has changed - reads of gBlockList now use OrderAccess::loadptracquire() code style cleanups: - only cleaned up the functions that I touched to make the OrderAccess::loadptracquire() changes - changed implied booleans into real boolean expressions - moved some locals to narrower context - added/removed some blank lines - made casts consistent with the majority style in this file I'm repeating all of the same testing that I did for round 0. The round 1 bits have not yet made it through JPRT-west, but the jobs are mostly done. Thanks, in advance, for any comments, questions or suggestions. Dan

On 10/19/15, 9:02 PM, Daniel D. Daugherty wrote: Greetings, I have a fix for a long standing race between the lock-free ObjectMonitor verification code and the normal (locked) ObjectMonitor block allocation code path. For this fix, I would like at least a Runtime team reviewer and a Serviceability team reviewer. Thanks! JDK-8047212 runtime/ParallelClassLoading/bootstrap/random/inner-complex assert(ObjectSynchronizer::verifyobjmonisinpool(inf)) failed: monitor is invalid https://bugs.openjdk.java.net/browse/JDK-8047212 Webrev URL: http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/ Testing: Aurora Adhoc RT-SVC nightly batch 4 inner-complex fastdebug parallel runs for 4+ days and 600K iterations without seeing this failure; the experiment is still running; final results to be reported at the end of the review cycle JPRT -testset hotspot This fix: - makes ObjectMonitor::gBlockList volatile - uses "OrderAccess::releasestoreptr(&gBlockList, temp)" to make sure the new block updates happen before gBlockList is changed to refer to the new block - add SA support for a "static pointer volatile" field like: static ObjectMonitor * volatile gBlockList; See the following link for a nice description of what "volatile" means in the different positions on a variable/parameter decl line: http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword Thanks, in advance, for any comments, questions or suggestions. Dan



More information about the hotspot-runtime-dev mailing list