RFR (XS) 8047212: fix race between ObjectMonitor alloc and verification code

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Oct 22 15:35:43 UTC 2015


Thanks for the re-review!

Current stress test results:

$ grep -v PASS doit_loop.fast_?.log
doit_loop.fast_0.log:Copy fast_0: loop #271077...
doit_loop.fast_1.log:Copy fast_1: loop #271178...
doit_loop.fast_2.log:Copy fast_2: loop #271217...
doit_loop.fast_3.log:Copy fast_3: loop #271223...

$ elapsed_times mark.start_test_run doit_loop.fast_0.log
mark.start_test_run 0 seconds
doit_loop.fast_0.log 1 days 19 hours 30 minutes 39 seconds

The typical failure rate for this bug is 1-3 failures in 3 days with some other failure modes (in C2 or G1) popping in for a visit.

So far... no failures...

Dan

On 10/22/15, 9:30 AM, Carsten Varming wrote:

Dear Dan,

I reviewed round 1. Looks good to me. Thank you for the updated webrev.

Carsten

On Tue, Oct 20, 2015 at 2:15 PM, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:

Greetings,

I've updated the fix based on feedback from Carsten V and David H.

Webrev URL: http://cr.openjdk.java.net/~dcubed/8047212-webrev/1-jdk9-hs-rt/

Changes relative to round 0:

- only src/share/vm/runtime/synchronizer.cpp has changed
- reads of gBlockList now use OrderAccess::load_ptr_acquire()

code style cleanups:

- only cleaned up the functions that I touched to make the OrderAccess::load_ptr_acquire() changes
- changed implied booleans into real boolean expressions
- moved some locals to narrower context
- added/removed some blank lines
- made casts consistent with the majority style in this file

I'm repeating all of the same testing that I did for round 0. The round 1 bits have not yet made it through JPRT-west, but the jobs are mostly done.

Thanks, in advance, for any comments, questions or suggestions.

Dan
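To illustrate why the reads of gBlockList need acquire semantics, here is a minimal sketch of the publication pattern in standard C++. It is not the HotSpot code: std::atomic stands in for the OrderAccess calls, and Block, g_block_list, publish_block, block_is_published and count_blocks are hypothetical names chosen for this example. The sketch assumes a single writer at a time (in HotSpot the allocation path runs under a lock), while readers take no lock.

```cpp
#include <atomic>

// Hypothetical stand-in for a block of monitors; not the HotSpot type.
struct Block {
  int    payload;  // stands in for the initialized monitor fields
  Block* next;     // link to the previously published block
};

// Head of the block list; plays the role that gBlockList plays in the fix.
static std::atomic<Block*> g_block_list{nullptr};

// Writer side. Assumption: callers serialize among themselves (for
// example with a lock), so only one thread publishes at a time.
void publish_block(Block* b, int payload) {
  // Fully initialize the new block first...
  b->payload = payload;
  b->next = g_block_list.load(std::memory_order_relaxed);
  // ...then publish it. The release store guarantees that the writes
  // above are visible to any thread that acquire-loads this pointer.
  g_block_list.store(b, std::memory_order_release);
}

// Lock-free reader side: walks the list without taking the writer's lock.
bool block_is_published(const Block* target) {
  // The acquire load pairs with the release store in publish_block().
  for (Block* b = g_block_list.load(std::memory_order_acquire);
       b != nullptr; b = b->next) {
    if (b == target) {
      return true;
    }
  }
  return false;
}

// Counts the blocks currently reachable from the list head.
int count_blocks() {
  int n = 0;
  for (Block* b = g_block_list.load(std::memory_order_acquire);
       b != nullptr; b = b->next) {
    ++n;
  }
  return n;
}
```

Without the release/acquire pairing, a reader on a weakly ordered CPU could observe the new head pointer before the block's fields have been written, which is the kind of window a lock-free verification pass can fall into.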

On 10/19/15, 9:02 PM, Daniel D. Daugherty wrote:

Greetings,

I have a fix for a long standing race between the lock-free ObjectMonitor verification code and the normal (locked) ObjectMonitor block allocation code path.

For this fix, I would like at least a Runtime team reviewer and a Serviceability team reviewer. Thanks!

JDK-8047212 runtime/ParallelClassLoading/bootstrap/random/inner-complex assert(ObjectSynchronizer::verify_objmon_isinpool(inf)) failed: monitor is invalid
https://bugs.openjdk.java.net/browse/JDK-8047212

Webrev URL: http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/

Testing:

- Aurora Adhoc RT-SVC nightly batch
- 4 inner-complex fastdebug parallel runs for 4+ days and 600K iterations without seeing this failure; the experiment is still running; final results to be reported at the end of the review cycle
- JPRT -testset hotspot

This fix:

- makes ObjectMonitor::gBlockList volatile
- uses "OrderAccess::release_store_ptr(&gBlockList, temp)" to make sure the new block updates happen before gBlockList is changed to refer to the new block
- adds SA support for a "static pointer volatile" field like:

      static ObjectMonitor * volatile gBlockList;

See the following link for a nice description of what "volatile" means in the different positions on a variable/parameter decl line:

http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword

Thanks, in advance, for any comments, questions or suggestions.

Dan
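As a small illustration of the point about where "volatile" sits in a declaration, the sketch below uses standard type traits to show the difference between a volatile pointer and a pointer to volatile data. Monitor, VolatilePtr and PtrToVolatile are hypothetical names for this example only; nothing here is taken from the webrev.

```cpp
#include <type_traits>

struct Monitor {};  // hypothetical placeholder type

// "Monitor * volatile": the pointer variable itself is volatile,
// the object it points to is not. This is the shape used for gBlockList.
using VolatilePtr = Monitor* volatile;

// "volatile Monitor *": the pointed-to object is volatile,
// the pointer variable itself is not.
using PtrToVolatile = volatile Monitor*;

// The qualifier applies to the pointer in the first form...
static_assert(std::is_volatile<VolatilePtr>::value,
              "pointer itself is volatile");
static_assert(!std::is_volatile<std::remove_pointer<VolatilePtr>::type>::value,
              "pointee is not volatile");

// ...and to the pointee in the second form.
static_assert(!std::is_volatile<PtrToVolatile>::value,
              "pointer itself is not volatile");
static_assert(std::is_volatile<std::remove_pointer<PtrToVolatile>::type>::value,
              "pointee is volatile");
```

The checks are evaluated at compile time, so the file compiling cleanly is the whole demonstration. Note that in standard C++ the volatile qualifier alone does not provide inter-thread ordering, which is consistent with the fix also using an explicit release store.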



More information about the hotspot-runtime-dev mailing list