(Preliminary) RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread
Thomas Schatzl thomas.schatzl at oracle.com
Tue Apr 30 14:57:20 UTC 2013
Hi all,
the webrev at http://cr.openjdk.java.net/~tschatzl/7038914/webrev/ presents a first stab at the CR "7038914: VM could throw uncaught OOME in ReferenceHandler thread".
The problem is that under very heavy memory pressure, the reference handler thread throws an exception with the message "Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Reference Handler"".
The change improves handling of out-of-memory conditions in the ReferenceHandler thread. Instead of letting the thread crash, which then disables reference processing, it catches the exception and continues.
I'd like to discuss the change as I'm not really familiar with JDK coding style, handling of such situations and have some questions about it.
Bugs.sun: http://bugs.sun.com/view_bug.do?bug_id=7038914
JBS: https://jbs.oracle.com/bugs/browse/JDK-7038914
Proposed webrev: http://cr.openjdk.java.net/~tschatzl/7038914/webrev/
First, I could not reliably reproduce the issue using the information in the CR. Only via code review (and an idea from Bengt Rutisson - thanks!) did I find a way to reliably reproduce an OOME in the reference handler. This involves implementing a custom java.lang.ref.ReferenceQueue, overriding the enqueue() method, and doing some allocation that causes an OOME within that method. My current theory is that synchronization/locking allocates some very small objects on the Java heap, so that an OOME can occur in that thread. I walked the locking code, but could not find a Java heap allocation there (ObjectMonitor seems to be a C heap object) - maybe I overlooked it. Perhaps somebody else knows? It cannot be the invocation of the Cleaner.clean() methods above the enqueuing, since that has its own try-catch block already. Anyway, since the reproducer I wrote shows the same symptoms as reported in the CR, I hope that this test case is sufficient to be regarded as a reproducer and the change as a fix.
The actual change in java/lang/ref/Reference, as mentioned, involves putting the entire main enqueuing procedure within a try-catch block. It only catches OOME, to reduce the chance of catching anything that should not be caught. The problem is that this fix does not (and cannot) really fix bad programming in anyone overriding java.lang.ref.ReferenceQueue.enqueue(): for example, if the OOME occurs before the original enqueue() method is actually executed, corruption of the queue may still be possible. On the other hand, since overriding ReferenceQueue.enqueue() requires putting the custom ReferenceQueue into the boot class path, I assume that people doing that are aware of possible issues.
Handling the OOME: in the catch block I put the following:
    // avoid crashing the reference handler thread,
    // but provide for some diagnosability
    assert false : e.toString();
to provide some diagnosability in the case of an exception (when running with assertions enabled). I copied that from other code that tries to catch similar problems in the clean() method of the Cleaners. There are other variants of managing this in the JDK, some involving calling System.exit(). I thought that was too drastic, so I didn't do that, but what is the appropriate way to handle this situation?
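To make the intended behaviour concrete, here is a minimal stand-alone sketch. It is not the code from the webrev: HandlerLoopSketch and process() are hypothetical names, the OOME is simulated by throwing it directly rather than by exhausting the heap, and the catch block simply skips the failed item instead of using the assert, so that the result does not depend on whether assertions are enabled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-alone illustration only; HandlerLoopSketch and process() are
// hypothetical names, not part of the JDK or of the webrev.
final class HandlerLoopSketch {

    /**
     * Runs the tasks in order on a fresh thread and returns how many completed.
     * With guard == true, an OutOfMemoryError thrown by a task is caught and
     * the loop moves on; with guard == false it terminates the thread.
     */
    static int process(List<Runnable> tasks, boolean guard) {
        AtomicInteger completed = new AtomicInteger();
        Thread handler = new Thread(() -> {
            for (Runnable task : tasks) {
                if (guard) {
                    try {
                        task.run();
                        completed.incrementAndGet();
                    } catch (OutOfMemoryError e) {
                        // Keep the handler thread alive. The proposed change
                        // uses "assert false : e.toString();" here; this sketch
                        // just skips the task so it behaves the same with -ea.
                    }
                } else {
                    task.run(); // an OOME here escapes and ends the thread
                    completed.incrementAndGet();
                }
            }
        }, "Sketch Handler");
        // Swallow the uncaught error so the unguarded run stays quiet.
        handler.setUncaughtExceptionHandler((t, e) -> { });
        handler.start();
        try {
            handler.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        Runnable ok = () -> { };
        // Simulated failure: thrown directly instead of exhausting the heap.
        Runnable failing = () -> { throw new OutOfMemoryError("simulated"); };
        List<Runnable> tasks = Arrays.asList(ok, failing, ok, ok);
        System.out.println("unguarded: " + process(tasks, false));
        System.out.println("guarded:   " + process(tasks, true));
    }
}
```

In this sketch the unguarded run completes only the first task because the thread dies on the second one, while the guarded run completes the three tasks that do not fail.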
If the use of locks or the synchronized keyword is indeed the problem, I think it is possible to use nonblocking synchronization that is known to not allocate any memory for managing the reference queues instead. However, I think to guard against misbehaving ReferenceQueue implementations you'd still want to have a try-catch block here.
Is the location of the test correct, i.e. in the JDK test/java/lang/ref directory? Or is the correct place for it the HotSpot test directories?
Since this is (seems to be) a JDK-only change, and this is my first time changing the JDK, I hope core-libs-dev is the right mailing list. Otherwise please direct me to the appropriate one.
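As a rough idea of what I mean by nonblocking synchronization, here is a small sketch of a lock-free, intrusive LIFO list built on a compare-and-set loop. This is only an illustration under assumptions, not a proposal for the actual ReferenceQueue code: LockFreeQueueSketch and Node are hypothetical names, Node stands in for a reference object that carries its own link field, and the sketch assumes each node is enqueued at most once (re-enqueuing nodes would expose it to the ABA problem).

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration; not the JDK's ReferenceQueue implementation.
final class LockFreeQueueSketch {

    /** Stand-in for a reference object that carries its own link field. */
    static final class Node {
        final String name;
        volatile Node next;

        Node(String name) {
            this.name = name;
        }
    }

    private final AtomicReference<Node> head = new AtomicReference<>();

    /**
     * Links the node in at the head without taking a lock. The method does
     * not create any objects itself: the link lives in the node.
     * Assumption: a node is enqueued at most once.
     */
    void enqueue(Node node) {
        Node current;
        do {
            current = head.get();
            node.next = current;
        } while (!head.compareAndSet(current, node));
    }

    /** Removes and returns the most recently enqueued node, or null if empty. */
    Node poll() {
        Node current;
        do {
            current = head.get();
            if (current == null) {
                return null;
            }
        } while (!head.compareAndSet(current, current.next));
        current.next = null; // unlink so the removed node does not retain the rest
        return current;
    }

    public static void main(String[] args) {
        LockFreeQueueSketch queue = new LockFreeQueueSketch();
        queue.enqueue(new Node("a"));
        queue.enqueue(new Node("b"));
        System.out.println(queue.poll().name);
        System.out.println(queue.poll().name);
        System.out.println(queue.poll());
    }
}
```

The retry loop replaces the lock: if another thread changes the head between the read and the compare-and-set, the operation simply tries again. A failing user-supplied enqueue() override would still need the try-catch discussed above.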
Thanks, Thomas