RFR: 8168914: Crash in ClassLoaderData/JNIHandleBlock::oops_do during concurrent marking (original) (raw)

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Wed Feb 15 20:46:35 UTC 2017


On 2/15/17 10:07 AM, Erik Helin wrote:

On 02/15/2017 02:48 AM, David Holmes wrote:

Hi Erik, Hi David, thanks for having a look! Please see new patches at: - incremental: http://cr.openjdk.java.net/~ehelin/8168914/00-01/ - full: http://cr.openjdk.java.net/~ehelin/8168914/01/

oopsdo can miss an oop that is in the process of being added - is that a problem? Nope, that is not an issue. Even if addhandle and oopsdo where synchronized with each other (for example by using lock), you can still have the scenario where a GC thread first runs (and finishes) oopsdo and then addhandle is called (a linearized execution). This scenario is something the concurrent marking algorithms already should be prepared to handle. 140 if (c->data[i] != NULL) { 141 f->dooop(&c->data[i]); 142 }

Given a slot can be nulled out concurrently at any time, is it worth does this NULL check? The OopClosure will have to do its own NULL check anyway. Sure, this was mostly to be a bit consistent with JNIHandleBlock (it does the same NULL check). Think of it as an optimization, there is no reason to call f->dooop if c->data[i] == NULL. Do you think it reads better if the NULL check is removed? I don't have a strong opinion on this one. 144 c = (Chunk*) OrderAccess::loadptracquire((volatile intptrt*)&c->next); This doesn't need to be a load-acquire. You already loaded 'c' via load-acquire of head (or chained therefrom) and its next field is set prior to the setting of the head that you read. Agree. I just discussed this with Erik Ö as well, and we all agree on this. I also removed the volatile specifier for next. 624 jobject ClassLoaderData::addhandle(Handle h) { 625 MutexLockerEx ml(metaspacelock(), Mutex::nosafepointcheckflag); 626 return (jobject) handles.add(h()); 627 } 628 629 void ClassLoaderData::removehandleunsafe(jobject h) { 630 assert(handles.contains((oop*) h), "Got unexpected handle " PTRFORMAT, p2i((oop*) h)); 631 ((oop) h) = NULL; 632 } I'm a bit unclear on the typing here. Previously we use a JNI Handle which was a jobject. But now we've simply stored an oop into a slot in an array. We pass the address of that slot back as a jobject, but is it really? So, this is a bit confusing, and I discussed this with Coleen a few days ago. jobject was used originally used because that is what JNIHandleBlock::allocatehandle returns. Using something other than jobject means updating ModuleEntry and in particular users ModuleEntry::pd. This is not something we are keen on doing right now for JDK 9, and Coleen is planning to clean up how the runtime handles oops for 10 (or at least improve the situation).

Yes, these are not really jobject if you think of jobject and jweak as being native handles to oops. These really are indirect pointers to oops that we use in the runtime.

We're still discussing how we want these to look for JDK10. Your suggestions are welcome - we'll include you on some email as we discuss this (and openjdk also).

I don't know what jobject is meant to represent, in the code it is just an empty class (an opaque type). ClassLoaderData::addhandle could just as well return an oop*, but AFAIU we try to avoid oops and oop* in the runtime code (because that might mean that some code forgot to tell the GC about the oop). Maybe I'm wrong here?

You are right. We don't want to have oop* in the runtime code because it also is unclear if it was supposed to be compressed or not. We want a special name for these.

Thanks, Coleen

> I would also expect contains to take an oop not an oop* - it > seems to be asking "is this an address of a slot in our > ChunkedHandleList" rather than asking "is this oop in our > ChunkedHandleList". ?? I actually wanted the assert to check whether the passed jobject (which really is an oop*) actually originated from a call to ClassLoaderData::addhandle. I don't want any code to pass a jobject to ClassLoaderData::removehandleunsafe that doesn't come from a call to ClassLoaderData::addhandle.

A few minor comments:

Copyrights need updating. Aaah, it is that time of the year again. Fixed. src/share/vm/classfile/classLoaderData.hpp 177 // Only on thread ... Typo: on -> one Thanks, fixed. src/share/vm/classfile/moduleEntry.cpp 90 // Create a JNI handle for the shared ProtectionDomain and save it atomically. 91 // If someone beats us setting the pd cache, the created JNI handle is destroyed. These are no longer JNI Handles. Thanks, I updated the comment. Thanks, Erik Thanks, David -----

Testing: - Tier 1 (aka JPRT), 2, 3, 4, 5.

I would appreciate quite a few reviews on this patch, it contains a nice mixture of: - me venturing into the runtime code :) - lock free code - unreproducible bug (even though I'm sure of the root cause) Thanks for everyone participating in the discussions around this bug and potential solutions: Volker, Coleen, Mikael G, StefanK, Erik Ö and Jiangli. Thanks! Erik [0]: https://bugs.openjdk.java.net/browse/JDK-8168914



More information about the hotspot-dev mailing list