RFR(XL): 8185640: Thread-local handshakes

Doerr, Martin martin.doerr at sap.com
Tue Nov 7 17:14:27 UTC 2017


Hi Coleen,

Sorry, I had mixed up the two safepoint paths in the interpreter. I had thought we'd jump to the safept_entry, which performs the call_VM to InterpreterRuntime::at_safepoint and then starts dispatching again. That approach would re-execute the current bytecode. But we're calling InterpreterRuntime::at_safepoint directly, which simply returns after the safepoint. So I'm OK with what you have proposed. Thanks.

Best regards, Martin
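
The difference between the two interpreter paths described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyInterpreter`, `Resume`, and `registrations_after_poll` are hypothetical names invented for illustration and are not HotSpot code. It shows why resuming by re-dispatching at an unchanged bcp would repeat the finalizer registration, while returning from the runtime call into the same template would not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy bytecodes; only the names echo the thread.
enum class Bytecode { nop, return_register_finalizer };

// How the toy interpreter resumes once a safepoint poll has been serviced.
enum class Resume {
    redispatch,           // jump back to dispatch: the bytecode at bcp runs again
    continue_in_template  // the runtime call returns into the same template
};

struct ToyInterpreter {
    std::vector<Bytecode> code;
    std::size_t bcp = 0;      // index of the current bytecode
    int registrations = 0;    // how often the finalizer got registered
    bool poll_armed = false;  // a safepoint is pending

    // Runs until the return bytecode completes; returns the registration count.
    int run(Resume mode) {
        assert(bcp <= code.size());
        while (bcp < code.size()) {
            Bytecode bc = code[bcp];
            if (bc == Bytecode::return_register_finalizer) {
                ++registrations;         // side effect happens before the poll
                if (poll_armed) {
                    poll_armed = false;  // the safepoint is serviced exactly once
                    if (mode == Resume::redispatch) {
                        continue;        // bcp unchanged: the side effect repeats
                    }
                }
                return registrations;    // the method returns
            }
            ++bcp;
        }
        return registrations;
    }
};

// Convenience wrapper: one nop, then the return bytecode, with a pending safepoint.
inline int registrations_after_poll(Resume mode) {
    ToyInterpreter interp;
    interp.code = {Bytecode::nop, Bytecode::return_register_finalizer};
    interp.poll_armed = true;
    return interp.run(mode);
}
```

In this model, `registrations_after_poll(Resume::redispatch)` counts two registrations and `registrations_after_poll(Resume::continue_in_template)` counts one, which mirrors the concern raised in the thread and the reason it turned out not to apply.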

-----Original Message-----
From: coleen.phillimore at oracle.com [mailto:coleen.phillimore at oracle.com]
Sent: Tuesday, 7 November 2017 17:51
To: Doerr, Martin <martin.doerr at sap.com>; Robbin Ehn <robbin.ehn at oracle.com>; hotspot-dev developers <hotspot-dev at openjdk.java.net>
Cc: Nils Eliasson <nils.eliasson at oracle.com>; Karen Kinnear <karen.kinnear at oracle.com>; Andrew Haley <aph at redhat.com>; David Holmes <david.holmes at oracle.com>
Subject: Re: RFR(XL): 8185640: Thread-local handshakes

On 11/7/17 10:08 AM, Doerr, Martin wrote:

Hi Coleen,

My point was not related to deoptimization. Maybe I was not clear enough. It could happen with -Xint (no deoptimization at all):
1. Java thread executes register_finalizer
2. Safepoint poll in Bytecodes::_return_register_finalizer bytecode gets hit
3. VM reexecutes the bytecode after safepoint (bcp still points to the Bytecodes::_return_register_finalizer bytecode)

This is my confusion then.  Why would the TemplateTable::_return_register_finalizer bytecode be reexecuted because of the safepoint if not for deoptimization?   I expect it to continue at the point in the template after the call_VM and not reregister the finalizer.

Btw., the compilers generate the poll after popping the frame, so doing it after remove_activation would be closer to that. But I don't want to propose this, either. I share the opinion that this may be messy.

Agree. Coleen

Best regards, Martin

-----Original Message-----
From: coleen.phillimore at oracle.com [mailto:coleen.phillimore at oracle.com]
Sent: Tuesday, 7 November 2017 15:54
To: Doerr, Martin <martin.doerr at sap.com>; Robbin Ehn <robbin.ehn at oracle.com>; hotspot-dev developers <hotspot-dev at openjdk.java.net>
Cc: Nils Eliasson <nils.eliasson at oracle.com>; Karen Kinnear <karen.kinnear at oracle.com>; Andrew Haley <aph at redhat.com>; David Holmes <david.holmes at oracle.com>
Subject: Re: RFR(XL): 8185640: Thread-local handshakes

On 11/7/17 9:04 AM, Doerr, Martin wrote:

Hi Coleen,

The TemplateTable::_return_register_finalizer bytecode wouldn't get run twice with deoptimization because TemplateInterpreter::deopt_reexecute_entry() will reexecute the _return_register_finalizer as a return(vtos) bytecode. Very tricky indeed.

Probably not with deoptimization, but what about the following scenario:
1. Java thread executes register_finalizer
2. Safepoint poll in Bytecodes::_return_register_finalizer bytecode gets hit
3. VM reexecutes the bytecode after safepoint (bcp still points to the Bytecodes::_return_register_finalizer bytecode)
This doesn't sound good to me.

The deoptimization only happens when the compiled frame is on the top of the stack, not TemplateTable::_return(), i.e. the interpreter, and does not deoptimize for the safepoint poll in the compiled frame. See frame::should_be_deoptimized and can_be_deoptimized (not sure of the difference between the two). If the deopt happens for the _return_register_finalizer call, it reruns the bytecode in the interpreter as Bytecodes::_return(vtos) rather than _return_register_finalizer, so that the registration doesn't happen twice. See TemplateInterpreter::deopt_reexecute_entry().

This is what I've learned in the past couple of days! It's very subtle, and Dean filed an RFE https://bugs.openjdk.java.net/browse/JDK-8190817 to hopefully help make this clearer.

I still would like to see the safepoint poll in TemplateTable::_return after the call to _return_register_finalizer since that's the order that the compiled code does it.

So that means, you'd expect the poll to be done after remove_activation. I think this may be more tricky to implement because we don't know where we return to (not necessarily an interpreted method). Right?

Sorry, no, I'd like the poll to be after the _return_register_finalizer call in TemplateTable::_return. remove_activation is a messy thing.

Thanks, Coleen

Best regards, Martin

-----Original Message-----
From: coleen.phillimore at oracle.com [mailto:coleen.phillimore at oracle.com]
Sent: Tuesday, 7 November 2017 14:46
To: Doerr, Martin <martin.doerr at sap.com>; Robbin Ehn <robbin.ehn at oracle.com>; hotspot-dev developers <hotspot-dev at openjdk.java.net>
Cc: Nils Eliasson <nils.eliasson at oracle.com>; Karen Kinnear <karen.kinnear at oracle.com>; Andrew Haley <aph at redhat.com>; David Holmes <david.holmes at oracle.com>
Subject: Re: RFR(XL): 8185640: Thread-local handshakes

On 11/7/17 7:04 AM, Doerr, Martin wrote:

Hi Robbin,

first of all, sorry that my proposal caused much more work on your side than expected. I still appreciate that you're doing it. Thanks.

I was not aware of this checking code on x86 in debug builds. But it looks feasible to me to move the safepoint check after the last_sp handling.

However, it looks like register_finalizer could get executed twice with the new implementation. So I'd prefer to generate the safepoint poll code only for the normal return bytecodes:

    if (SafepointMechanism::uses_thread_local_poll() && _desc->bytecode() != Bytecodes::_return_register_finalizer)

Bytecodes::_return_register_finalizer is only used for returns from the Object.<init> constructor. I think we can live without a safepoint poll there. I don't need to see a new webrev for this.

The TemplateTable::_return_register_finalizer bytecode wouldn't get run twice with deoptimization because TemplateInterpreter::deopt_reexecute_entry() will reexecute the _return_register_finalizer as a return(vtos) bytecode. Very tricky indeed.

Your suggested change would work also because call_VM for _return_register_finalizer will do a safepoint check on the JRT_END transition, so there'd be only one check here, which is probably fine.

I still would like to see the safepoint poll in TemplateTable::_return after the call to _return_register_finalizer since that's the order that the compiled code does it.

thanks, Coleen

Best regards, Martin

-----Original Message-----
From: Robbin Ehn [mailto:robbin.ehn at oracle.com]
Sent: Tuesday, 7 November 2017 11:59
To: hotspot-dev developers <hotspot-dev at openjdk.java.net>
Cc: Doerr, Martin <martin.doerr at sap.com>; Nils Eliasson <nils.eliasson at oracle.com>; Karen Kinnear <karen.kinnear at oracle.com>; Andrew Haley <aph at redhat.com>; coleen.phillimore at oracle.com; David Holmes <david.holmes at oracle.com>
Subject: Re: RFR(XL): 8185640: Thread-local handshakes

Hi all,

First, a bug has been found: when deopt happens in return, the return is re-executed and we hit an assert in the call_VM because last_sp is now not NULL. After some discussion, the proposed solution is to move the poll after the explicit reset of last_sp. (Re-execution is always vtos.) This is fixed in the #15 changeset; #14 is just some copyright year updates. (The #9 changeset was dropped before it went on RFR, so it's not listed.)

The JEP will be targeted to JDK 10 on Friday and integration will happen shortly after.

For completeness, here are all incremental webrevs and a full one (rebased on jdk/hs). I'm adding everyone on CC as reviewers.

The code will be committed on: "8189941: Implementation JEP 312: Thread-local handshake"

Tested tier 1-5, jprt, all tonga.

Martin, can you have a quick look at the #15 changeset?
Thanks, Robbin

SafepointMechanism-0 http://cr.openjdk.java.net/~rehn/8185640/v10/SafepointMechanism-0/webrev/
PollingPage-1 http://cr.openjdk.java.net/~rehn/8185640/v10/PollingPage-1/webrev/
Handshakes-2 http://cr.openjdk.java.net/~rehn/8185640/v10/Handshakes-2/webrev/
Atomic-Update-Rebase-3 http://cr.openjdk.java.net/~rehn/8185640/v10/Atomic-Update-Rebase-3/webrev/
Coleen-n-Test-Cleanup-4 http://cr.openjdk.java.net/~rehn/8185640/v10/Coleen-n-Test-Cleanup-4/webrev/
Assorted-Karen-5 http://cr.openjdk.java.net/~rehn/8185640/v10/Assorted-Karen-5/webrev/
Support-Check-Haley-6 http://cr.openjdk.java.net/~rehn/8185640/v10/Support-Check-Haley-6/webrev/
Interpreter-Poll-7 http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-7/webrev/
Interpreter-Poll-WideRet-8 http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-WideRet-8/webrev/
Interpreter-Poll-Switch-10 http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-Switch-10/webrev/
Interpreter-Poll-Ret-11 http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-Ret-11/webrev/
Option-Cleanup-12 http://cr.openjdk.java.net/~rehn/8185640/v10/Option-Cleanup-12/webrev/
DavidH-Option-Cleanup-13 http://cr.openjdk.java.net/~rehn/8185640/v10/DavidH-Option-Cleanup-13/webrev/
Copyright-Update-14 http://cr.openjdk.java.net/~rehn/8185640/v10/Copyright-Update-14/webrev/
Interpreter-Poll-Ret-Deopt-Fix-15 http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-Ret-Deopt-Fix-15/webrev/
Full http://cr.openjdk.java.net/~rehn/8185640/v10/Full/webrev/

On 10/11/2017 03:37 PM, Robbin Ehn wrote:

Hi all,

Starting the review of the code while JEP work is still not completed.

JEP: https://bugs.openjdk.java.net/browse/JDK-8185640

This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not just all threads or none.
Entire changeset: http://cr.openjdk.java.net/~rehn/8185640/v0/flat/

Divided into 3 parts:
SafepointMechanism abstraction: http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/
Consolidating polling page allocation: http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/
Handshakes: http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/

A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint-safe state. The callback is executed either by the thread itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per-thread operation will be performed on all threads as soon as possible, and each thread will continue to execute as soon as its own operation is completed. If a JavaThread is known to be running, then a handshake can be performed with that single JavaThread as well.

The current safepointing scheme is modified to perform an indirection through a per-thread pointer, which will allow a single thread's execution to be forced to trap on the guard page. In order to force a thread to yield, the VM updates the per-thread pointer for the corresponding thread to point to the guarded page.

Examples of potential use cases:
- Biased lock revocation
- External requests for stack traces
- Deoptimization
- Async exception delivery
- External suspension
- Eliding memory barriers

All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints.

For platforms that do not yet implement the per-JavaThread poll, a fallback to a normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported platforms are Linux x64 and Solaris SPARC.

Tested heavily with various test suites and comes with a few new tests.
Performance testing using standardized benchmarks shows no significant changes; the latest number was -0.7% on Linux x64 and +1.5% on Solaris SPARC (not statistically ensured). A minor regression for the load vs. load-load on x64 is expected, and a slight increase on SPARC due to the cost of 'materializing' the page vs. load-load.

The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html), which puts all JavaThreads in an array instead of a linked list.

Thanks, Robbin
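
The per-thread poll indirection and the "callback executed by the thread itself" case described in the RFR text can be sketched as follows. This is a simplified illustration, not the HotSpot implementation: `PollPage`, `ToyThread`, `poll_site`, and `arm_handshake` are hypothetical names, and the "guarded page" is modeled by an address comparison rather than by a real memory-protection trap. The path where the VM thread runs the callback on behalf of a blocked thread is left out.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Stand-ins for the readable and the guarded polling page. Real code would
// trap on a load from the guarded page; this sketch only inspects a flag.
struct PollPage { bool guarded; };
static PollPage readable_page{false};
static PollPage guarded_page{true};

struct ToyThread {
    int id = 0;
    int handshakes_run = 0;
    // Per-thread indirection: each thread polls through its own pointer.
    std::atomic<const PollPage*> poll{&readable_page};
    std::function<void(ToyThread&)> pending;  // operation to run when armed

    // Called by the thread itself at a poll site (for example a return bytecode).
    void poll_site() {
        if (poll.load(std::memory_order_acquire)->guarded) {
            if (pending) {
                pending(*this);  // the thread executes its own operation
            }
            pending = nullptr;
            poll.store(&readable_page, std::memory_order_release);  // disarm
        }
    }
};

// Arms one thread only; every other thread keeps polling the readable page.
// Assumption of this sketch: no second handshake is armed before the first
// one has been serviced.
inline void arm_handshake(ToyThread& target, std::function<void(ToyThread&)> op) {
    assert(static_cast<bool>(op));  // a handshake needs an operation
    target.pending = std::move(op);
    target.poll.store(&guarded_page, std::memory_order_release);
}
```

Arming thread A with `arm_handshake(a, op)` leaves thread B's `poll_site()` a no-op, while A runs `op` once at its next poll and then continues unhindered, which is the "stop individual threads and not just all threads or none" property the JEP text describes.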


