RFR(M,v3): JDK-8059036 : Implement Diagnostic Commands for heap and finalizerinfo (original) (raw)

Derek White derek.white at oracle.com
Fri May 15 22:46:32 UTC 2015


Hi Peter,

Some comments below...

On 5/14/15 3:50 AM, Peter Levart wrote:

Hi Derek,

On 05/13/2015 10:34 PM, Derek White wrote: Hi Peter,

I don't have smoking gun evidence that your change introduces a bug, but I have some concerns... I did have a concern too, ...

On 5/12/15 6:05 PM, Peter Levart wrote: Hi Dmitry,

You iterate the queue then, not the unfinalized list. That's more logical. Holding the queue's lock may pause reference handler and finalizer threads for the entire time of iteration. This can blow up the application. Suppose one wants to diagnose the application because he suspects that finalizer thread hardly keeps up with production of finalizable instances. This can happen if the allocation rate of finalizable objects is very high and/or finalize() methods are not coded to be fast enough. Suppose the queue of Finalizer(s) builds up gradually and is already holding several million objects. Now you initiate the diagnostic command to print the queue. The iteration over and grouping/counting of several millions of Finalizer(s) takes some time. This blocks finalizer thread and reference handler thread. But does not block the threads that allocate finalizable objects. By the time the iteration is over, the JVM blows up and application slows down to a halt doing GCs most of the time, getting OOMEs, etc... It is possible to iterate the elements of the queue for diagnostic purposes without holding a lock. The modification required to do that is the following (havent tested this, but I think it should work): http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.01/ One issue to watch out for is the garbage collectors inspect the Reference.next field from C code directly (to distinguish active vs. pending, iterate over oops, etc.). You can search hotspot for javalangrefReference::next*, javalangrefReference::setnext*. Your change makes "inactive" References superficially look like "enqueued" References. The GC code is rather subtle and I haven't yet seen a case where it would get confused by this change, but there could easily be something like that lurking in the GC code. ...but then I thought that GC can in no way treat a Reference differently whether it is still enqueued in a ReferenceQueue or already dequeued (inactive) - responsibility is already on the Java side. Currently the definition of Reference.next is this: /* When active: NULL * pending: this * Enqueued: next reference in queue (or this if last) * Inactive: this */ @SuppressWarnings("rawtypes") Reference next; We see that, unless GC inspects all ReferenceQueue instances and scans their lists too, the state of a Reference that is enqueued as last in list is indistinguishable from the state of inactive Reference. So I deduced that this distinction (enqueued/inactive) can't be established solely on the value of .next field ( == this or != this)... But I should inspect the GC code too to build a better understanding of that part of the story... Anyway. It turns out that there is already enough state in Reference to destinguish between it being enqueued as last in list and already dequeued (inactive) - the additional state is Reference.queue that transitions from ReferenceQueue.ENQUEUED -> ReferenceQueue.NULL in ReferenceQueue.reallyPoll. We need to just make sure the two fields (r.next and r.queue) are assigned and read in correct order. This can be achieved either by making Reference.next a volatile field or by a couple of explicit fences: http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.02/ The assignment of r.queue to ReferenceQueue.ENQUEUED already happens before linking the reference into the queue's head in ReferenceQueue.enqueue(): r.queue = ENQUEUED; r.next = (head == null) ? r : head; head = r; Both stores are volatile.

This sounds a bit better. I don't have a good handle on the pros and cons of a volatile field over explicit fences via Unsafe in this case. Kim might jump in soon.

I'd like to suggest an entirely less-clever solution. Since the problem is that forEach() might see inconsistent values for the queue and next fields of a Reference, but we don't want to grab a lock walking the whole queue, we could instead grab the lock as we look at each Reference (and do the "back-up" trick if we were interfered with). Of course this is slower than going lock-free, but it's not any more overhead than the ReferenceHandler thread or FinalizerThreads incur.

The only benefit of this solution over the others is that it is less clever, and I'm not sure how much cleverness this problem deserves. Given that, and ranking the solutions as "lock" < (clever) "volatile" < "fence", I'd be happier with the "volatile" or "lock" solution if that is sufficient.



More information about the core-libs-dev mailing list