RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64
Doerr, Martin martin.doerr at sap.com
Fri Oct 21 12:57:42 UTC 2016
Hi all,
thank you very much for reviewing. I fully agree with the latest replies.
I think Hiroshi's latest webrev (http://cr.openjdk.java.net/~horii/8154736/webrev.05/) is pretty close to that. The only remaining issue is the acquire barriers, which could be replaced by a comment like "We rely on memory_order_consume here.". I'd prefer this, too, even though acquire barriers in the failure cases would probably not really hurt. Cmpxchg Release,Relaxed + Load Consume seems to be the pattern that matches the needs exactly.
The webrev also contains a logging change in psPromotionManager.inline.hpp, and I'm not sure whether it is still wanted.
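A minimal sketch of that pattern in standard C++11 atomics, not the HotSpot code from the webrev. It assumes "Release,Relaxed" means release ordering on a successful cmpxchg and relaxed ordering on a failed one; `Obj`, `forwardee`, `payload` and `forward` are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cstring>

// Hypothetical stand-in for an object with a forwarding-pointer slot.
struct Obj {
    std::atomic<Obj*> forwardee{nullptr};
    int payload[4] = {0, 0, 0, 0};
};

// Copy obj into copy, then try to publish copy as obj's forwardee.
// Returns whichever copy ended up installed.
Obj* forward(Obj* obj, Obj* copy) {
    // The copy must be complete before the pointer becomes visible.
    std::memcpy(copy->payload, obj->payload, sizeof(obj->payload));

    Obj* expected = nullptr;
    // Success: release, so the memcpy above is ordered before the publish.
    // Failure: relaxed, so no barrier is paid when another thread won.
    if (obj->forwardee.compare_exchange_strong(expected, copy,
                                               std::memory_order_release,
                                               std::memory_order_relaxed)) {
        return copy;
    }
    // We rely on memory_order_consume here: the loads through the returned
    // pointer depend on this load. Current compilers typically implement
    // consume as acquire.
    return obj->forwardee.load(std::memory_order_consume);
}
```

In this sketch the second load after a failed cmpxchg stands in for the "Load Consume" step; whether the real code reloads or reuses the value returned by the cmpxchg is not something the sketch tries to settle.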
Not sure if aarch64 should be addressed in a separate change.
Besides that, it looks good to me.
Best regards, Martin
-----Original Message-----
From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Andrew Haley
Sent: Tuesday, October 11, 2016 11:26
To: Kim Barrett; David Holmes
Cc: hotspot-compiler-dev; Hiroshi H Horii; Tim Ellison; ppc-aix-port-dev at openjdk.java.net; Michihiro Horie; hotspot-gc-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64
On 06/10/16 23:16, Kim Barrett wrote:
> The key issue here is that we copy obj into newobj, and then make newobj accessible to other threads via the CAS. Those other threads might attempt to access data in newobj. This suggests the CAS ought to have at least a release fence to ensure the copy is complete before the CAS is performed. No amount of fencing on the read side (such as in the work stealing) can remove that need.
I agree.
> And that might be all that is needed. On the post-CAS side, we load the forwardee and then load values from it. I think we can use implicit consume with dependent loads (except on Alpha) plus the suggested release fence to get the desired effect.
That's probably true, except that there's not really any such thing as "implicit consume" in C++. While all of the hardware we use respects address dependencies, it's not something that the compiler knows about, and it's explicitly undefined behaviour in the C++ memory model. If we're depending on memory_order_consume, perhaps we ought to think about adding it to Atomic, even though it's just a volatile load in older compilers.
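As a rough illustration of what such an addition could look like, here is a sketch in standard C++. Both helper names are hypothetical and are not part of HotSpot's Atomic class. The second helper shows the "just a volatile load" fallback, which relies on the hardware respecting address dependencies and is not guaranteed by the language.

```cpp
#include <atomic>

// Hypothetical helper: consume load expressed through the C++11 memory model.
// Current compilers typically treat consume as acquire.
template <typename T>
inline T* load_consume(const std::atomic<T*>& slot) {
    return slot.load(std::memory_order_consume);
}

// Hypothetical fallback for compilers without C++11 atomics: a plain
// volatile load. Ordering of dependent loads then comes only from the
// hardware's address-dependency rules, not from the compiler.
template <typename T>
inline T* load_consume_volatile(T* const volatile* slot) {
    return *slot;
}
```

A reader would call one of these to fetch the forwardee and only then load fields through the returned pointer, so that every later access carries a dependency on the consume load.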
Andrew.