RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64
Carsten Varming varming at gmail.com
Thu Sep 29 14:47:29 UTC 2016
Dear Hiroshi,
In hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:266 the log line reads data from the forwardee even when the CAS fails. I believe those reads will be unsafe without barriers after the copy of the content of the object.
hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288 has the same problem as line 266.
I would argue that the logging should only happen if the thread successfully copied the object and CAS failures should be logged separately without reading data from the forwardee.
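To make the concern concrete, here is a minimal sketch, assuming std::atomic in place of HotSpot's Atomic::cmpxchg. Obj, CopyResult, forward_relaxed and log_copy are hypothetical names for illustration, not code from the webrev. With a relaxed CAS, a thread that loses the race learns the winner's pointer but has no ordering guarantee for the winner's writes into the copy, so the sketch logs only the pointer value in that case.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for a heap object; this is not HotSpot's oopDesc.
struct Obj {
    std::atomic<Obj*> forwardee{nullptr};  // forwarding pointer, installed at most once
    std::size_t size = 0;                  // payload a log line might want to read
};

struct CopyResult {
    Obj* winner;     // the copy that ended up installed
    bool installed;  // true only if this thread's CAS succeeded
};

// Try to install `copy` as the forwardee of `old_obj` using a relaxed CAS.
CopyResult forward_relaxed(Obj& old_obj, Obj* copy) {
    assert(copy != nullptr);
    Obj* expected = nullptr;
    bool ok = old_obj.forwardee.compare_exchange_strong(
        expected, copy, std::memory_order_relaxed, std::memory_order_relaxed);
    // On failure, `expected` now holds the pointer another thread installed.
    return CopyResult{ok ? copy : expected, ok};
}

// Format a log line without dereferencing a forwardee this thread did not write.
int log_copy(const CopyResult& r, char* buf, std::size_t n) {
    if (r.installed) {
        // Safe: this thread initialized *r.winner itself before the CAS.
        return std::snprintf(buf, n, "copied to %p size=%zu",
                             static_cast<void*>(r.winner), r.winner->size);
    }
    // CAS lost: print the pointer value only, never the object's fields.
    return std::snprintf(buf, n, "lost race, forwardee=%p",
                         static_cast<void*>(r.winner));
}
```

Reading fields of the winner's copy after a failed CAS would need at least acquire ordering on the failure path, paired with release ordering on the installing side.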
BTW, unrelated to your change: it seems the logging in line 266 should be guarded by something like "if (log_develop_is_enabled(Trace, gc, scavenge))", like the logging in line 288.
Carsten
On Thu, Sep 29, 2016 at 8:00 AM, Hiroshi H Horii <HORII at jp.ibm.com> wrote:
Hi all,
Can I please request reviews for a change for 8154736 that improves copy_to_survivor performance on ppc64 and aarch64? If possible, I would like to include this change in jdk9.

8154736 includes two changes, cmpxchg and copy_to_survivor, and the former was resolved as 8155949. Now, I would like to ask for a review of the remaining copy_to_survivor change.

webrev: http://cr.openjdk.java.net/~mdoerr/8154736copytosurvivor/webrev.01/
JIRA: https://bugs.openjdk.java.net/browse/JDK-8154736

I tested this change with SPECjbb2013. Also, I re-checked that relaxed cmpxchg is available for changing forwarding pointers. However, because this change is sensitive, we need more reviews not only from compiler-dev, but also from gc-dev.

Regards,
Hiroshi
-----------------------
Hiroshi Horii, Ph.D.
IBM Research - Tokyo
From: David Holmes <david.holmes at oracle.com>
To: "Doerr, Martin" <martin.doerr at sap.com>, Hiroshi H Horii/Japan/IBM at IBMJP
Cc: Tim Ellison <TimEllison at uk.ibm.com>, "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>, "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>, "hotspot-runtime-dev at openjdk.java.net" <hotspot-runtime-dev at openjdk.java.net>
Date: 05/10/2016 19:31
Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64

On 10/05/2016 7:41 PM, Doerr, Martin wrote:
> Hi David,
>
> thank you very much for testing the other platforms.
>
> Here's an updated webrev:
> http://cr.openjdk.java.net/~mdoerr/8155949relaxedcas/webrev.01/

Thanks. Second test run on its way.

David
-----

> Best regards,
> Martin
>
> -----Original Message-----
> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of David Holmes
> Sent: Tuesday, 10 May 2016 11:11
> To: Hiroshi H Horii <HORII at jp.ibm.com>
> Cc: Tim Ellison <TimEllison at uk.ibm.com>; ppc-aix-port-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64
>
> The fix seems incomplete for solaris:
>
> make/Main.gmk:232: recipe for target 'hotspot' failed
> "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp",
> line 124: Error: Too many arguments in call to
> "_Atomic_cmpxchg_long(long, volatile long*, long)".
> "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp",
> line 128: Error: Too many arguments in call to
> "_Atomic_cmpxchg_long(long, volatile long*, long)".
>
> David
>
> On 10/05/2016 5:34 PM, David Holmes wrote:
>> Hi Hiroshi,
>>
>> On 6/05/2016 8:11 PM, Hiroshi H Horii wrote:
>>> Hi David,
>>>
>>> Thank you for your comments.
>>>
>>> As Martin suggested me, I would like to separate this proposal to
>>> - relaxing memory order of cmpxchg
>>> - improvement of copy_to_survivor with relaxed cmpxchg
>>> and discuss the former first.
>>>
>>> Martin thankfully created a new webrev that include a change of cmpxchg.
>>> http://cr.openjdk.java.net/~mdoerr/8155949relaxedcas/webrev.00/
>>> He has already tested it with AIX, linux_x86_64, linux_ppc64le and
>>> darwin_intel64.
>>> (Please tell me if I need to send a new mail for this RFR)
>>
>> Please do as it will be simpler to track that way.
>>
>>>> What I would prefer to see is an additional memory_order value (such as
>>>> memory_order_ignored) which is the default for all methods declared to
>>>> take a memory_order parameter.
>>>
>>> We added simple enum to specify memory order in atomic.hpp as follows.
>>>
>>> typedef enum cmpxchg_cmpxchg_memory_order {
>>>   memory_order_relaxed,
>>>   memory_order_conservative
>>> } cmpxchg_memory_order;
>>>
>>> All of cmpxchg functions have an argument of cmpxchg_memory_order
>>> with a default value memory_order_conservative that uses the same
>>> semantics with the existing cmpxchg and requires no change for the existing
>>> callers. If you think "memory_order_ignored" is better than
>>> "memory_order_conservative", I will be happy to modify this change.
>>> (I just thought, "ignored" may resemble "relaxed" and may make
>>> people who are familiar with C++11's memory semantics confused.
>>> I would like to know thoughts of native speakers.)
>>
>> That is fine by me. I don't think "ignored" would be confused with
>> "relaxed", but "conservative" is fine.
>>
>> I will run the patch through our internal build system while you prepare
>> the updated RFR. My only concern is "unused argument" warnings from the
>> compiler. :)
>>
>> We are quickly running into a hard deadline with Feature Complete
>> however - possibly less than 24 hours - for hotspot changes.
>> If this doesn't get in in time I will see if I can shepherd it through the
>> approval process.
>>
>> Thanks,
>> David
>>
>>
>>> Regards,
>>> Hiroshi
>>> -----------------------
>>> Hiroshi Horii, Ph.D.
>>> IBM Research - Tokyo
>>>
>>>
>>> David Holmes <david.holmes at oracle.com> wrote on 05/04/2016 14:55:29:
>>>
>>>> From: David Holmes <david.holmes at oracle.com>
>>>> To: Hiroshi H Horii/Japan/IBM at IBMJP
>>>> Cc: hotspot-gc-dev at openjdk.java.net, hotspot-runtime-dev at openjdk.java.net, ppc-aix-port-dev at openjdk.java.net, Tim Ellison
>>>> <TimEllison at uk.ibm.com>, Volker Simonis <volker.simonis at gmail.com>,
>>>> "Doerr, Martin" <martin.doerr at sap.com>, "Lindenmaier, Goetz"
>>>> <goetz.lindenmaier at sap.com>
>>>> Date: 05/04/2016 14:57
>>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>>>> copy_to_survivor for ppc64
>>>>
>>>> Hi Hiroshi,
>>>>
>>>> Sorry for the delay on getting back to this.
>>>>
>>>> On 25/04/2016 5:09 PM, Hiroshi H Horii wrote:
>>>>> Hi David,
>>>>>
>>>>> Thank you for your comments and questions.
>>>>>
>>>>>> 1. Are the current cmpxchg semantics exactly the same as
>>>>>> memory_order_seq_cst?
>>>>>
>>>>> This is very good question..
>>>>>
>>>>> I guess, cmpxchg needs a more conservative constraint for memory ordering
>>>>> than C++11, to add sync after a compare-and-exchange operation.
>>>>>
>>>>> Could someone give comments or thoughts?
>>>>
>>>> I don't want to comment on the comparison with C++11. What I would
>>>> prefer to see is an additional memory_order value (such as
>>>> memory_order_ignored) which is the default for all methods declared to
>>>> take a memory_order parameter. That way existing implementations are
>>>> clearly ignoring the memory_order attribute and there is no potential
>>>> for confusion as to whether the existing implementations equate to
>>>> memory_order_seq_cst or not.
>>>>
>>>> That said, I'm not sure it makes sense to add the memory_order parameter
>>>> to all methods with "cas" in their name, e.g. oopDesc::cas_set_mark,
>>>> oopDesc::cas_forward_to, unless those methods can sensibly be called
>>>> with any value for memory_order - which seems highly unlikely. Perhaps
>>>> those methods should identify the weakest form of memory_order they
>>>> support and that should be hard-wired into them?
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> memory_order_seq_cst is defined as
>>>>> "Any operation with this memory order is both an acquire operation and
>>>>> a release operation, plus a single total order exists in which all threads
>>>>> observe all modifications (see below) in the same order."
>>>>> (http://en.cppreference.com/w/cpp/atomic/memory_order)
>>>>>
>>>>> In my environment, g++ and xlc generate following assemblies on ppc64le.
>>>>> (interestingly, they generates the same assemblies for any memory_order)
>>>>>
>>>>> g++ (4.9.2)
>>>>>   100008a4:  ac 04 00 7c   sync
>>>>>   100008a8:  28 50 20 7d   lwarx   r9,0,r10
>>>>>   100008ac:  00 18 09 7c   cmpw    r9,r3
>>>>>   100008b0:  0c 00 c2 40   bne-    100008bc
>>>>>   100008b4:  2d 51 80 7c   stwcx.  r4,0,r10
>>>>>   100008b8:  f0 ff c2 40   bne-    100008a8
>>>>>   100008bc:  2c 01 00 4c   isync
>>>>>
>>>>> xlc (13.1.3)
>>>>>   10000888:  ac 04 00 7c   sync
>>>>>   1000088c:  28 28 c0 7c   lwarx   r6,0,r5
>>>>>   10000890:  40 00 26 7c   cmpld   r6,r0
>>>>>   10000894:  0c 00 82 40   bne     100008a0
>>>>>   10000898:  2d 29 80 7c   stwcx.  r4,0,r5
>>>>>   1000089c:  f0 ff e2 40   bne+    1000088c
>>>>>   100008a0:  2c 01 00 4c   isync
>>>>>
>>>>> On the other hand, the current OpenJDK generates following assemblies.
>>>>>
>>>>>   508:  ac 04 00 7c   sync
>>>>>   50c:  00 00 5c e9   ld      r10,0(r28)
>>>>>   510:  00 50 3b 7c   cmpd    r27,r10
>>>>>   514:  1c 00 c2 40   bne-    530
>>>>>   518:  a8 40 5c 7d   ldarx   r10,r28,r8
>>>>>   51c:  00 50 3b 7c   cmpd    r27,r10
>>>>>   520:  10 00 c2 40   bne-    530
>>>>>   524:  ad 41 3c 7d   stdcx.  r9,r28,r8
>>>>>   528:  f0 ff c2 40   bne-    518
>>>>>   52c:  ac 04 00 7c   sync
>>>>>   530:  00 50 bb 7f   ...
>>>>>
>>>>> Though we can ignore 50c-514 (because they are a duplicated guard condition),
>>>>> the last sync instruction (52c) makes cmpxchg more strict than
>>>>> memory_order_seq_cst.
>>>>>
>>>>> In some cases, the last sync is necessary when this thread must be able to read
>>>>> all of the changes in the other threads while executing from 508 to 530
>>>>> (that processes compare-and-exchange).
>>>>>
>>>>>> 2. Has there been a discussion already, establishing that the modified
>>>>>> GC code can indeed use memory_order_relaxed? Otherwise who is
>>>>>> postulating that and based on what evidence?
>>>>>
>>>>> Volker and his colleagues have investigated the current GC codes
>>>>> according to this.
>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019079.html
>>>>> However, I believe, we need comments of other GC experts to change
>>>>> the shared codes.
>>>>>
>>>>> Regards,
>>>>> Hiroshi
>>>>> -----------------------
>>>>> Hiroshi Horii, Ph.D.
>>>>> IBM Research - Tokyo
>>>>>
>>>>>
>>>>> David Holmes <david.holmes at oracle.com> wrote on 04/22/2016 21:57:07:
>>>>>
>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>> To: Hiroshi H Horii/Japan/IBM at IBMJP, hotspot-runtime-dev at openjdk.java.net, hotspot-gc-dev at openjdk.java.net
>>>>>> Cc: Tim Ellison <TimEllison at uk.ibm.com>, ppc-aix-port-dev at openjdk.java.net
>>>>>> Date: 04/22/2016 21:58
>>>>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>>>>>> copy_to_survivor for ppc64
>>>>>>
>>>>>> Hi Hiroshi,
>>>>>>
>>>>>> Two initial questions:
>>>>>>
>>>>>> 1. Are the current cmpxchg semantics exactly the same as
>>>>>> memory_order_seq_cst?
>>>>>>
>>>>>> 2. Has there been a discussion already, establishing that the modified
>>>>>> GC code can indeed use memory_order_relaxed?
>>>>>> Otherwise who is postulating that and based on what evidence?
>>>>>>
>>>>>> Missing memory barriers have caused very difficult to track down bugs in
>>>>>> the past - very rare race conditions. So any relaxation here has to be
>>>>>> done with extreme confidence.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> On 22/04/2016 10:28 PM, Hiroshi H Horii wrote:
>>>>>>> Dear all:
>>>>>>>
>>>>>>> Can I please request reviews for the following change?
>>>>>>>
>>>>>>> Code change:
>>>>>>> http://cr.openjdk.java.net/~mdoerr/8154736copytosurvivor/webrev.00/
>>>>>>> (I initially created and Martin enhanced so much)
>>>>>>>
>>>>>>> This change follows the discussion started from this mail.
>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/018960.html
>>>>>>>
>>>>>>> Description:
>>>>>>> This change provides relaxed compare-and-exchange by introducing
>>>>>>> similar semantics of C++ atomic memory operators, enum memory_order.
>>>>>>> As described in atomic_linux_ppc.inline.hpp, the current implementation of
>>>>>>> cmpxchg is fence_cmpxchg_acquire. This implementation is useful for
>>>>>>> general purposes because twice calls of sync before and after cmpxchg will
>>>>>>> provide strict consistency. However, they sometimes cause overheads
>>>>>>> because sync instructions are very expensive in the current POWER chip design.
>>>>>>> In addition, for the other platforms, such as aarch64, this strict semantics
>>>>>>> may cause some overheads (according to the Andrew's mail).
>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019073.html
>>>>>>>
>>>>>>> With this change, callers can explicitly specify constraints of memory ordering
>>>>>>> for cmpxchg with an additional parameter, memory_order order.
>>>>>>>
>>>>>>> typedef enum memory_order {
>>>>>>>   memory_order_relaxed,
>>>>>>>   memory_order_consume,
>>>>>>>   memory_order_acquire,
>>>>>>>   memory_order_release,
>>>>>>>   memory_order_acq_rel,
>>>>>>>   memory_order_seq_cst
>>>>>>> } memory_order;
>>>>>>>
>>>>>>> Because the default value of the parameter is memory_order_seq_cst,
>>>>>>> existing codes can use the same semantics of cmpxchg without any
>>>>>>> modification. The relaxed cmpxchg is implemented only on ppc
>>>>>>> in this changeset. Therefore, the behavior on the other platforms will
>>>>>>> not be changed with this changeset.
>>>>>>>
>>>>>>> In addition, with the new parameter of cmpxchg, this change improves
>>>>>>> performance of copy_to_survivor in the parallel GC.
>>>>>>> copy_to_survivor changes forward pointers by using cmpxchg. This
>>>>>>> operation doesn't require any sync instructions. A pointer is changed
>>>>>>> at most once in a GC and when cmpxchg fails, the latest pointer is
>>>>>>> available for the caller. cas_set_mark and cas_forward_to are extended
>>>>>>> with an additional memory_order parameter as cmpxchg and copy_to_survivor
>>>>>>> uses memory_order_relaxed to modify the forward pointers.
>>>>>>>
>>>>>>> Summary of source code changes:
>>>>>>>
>>>>>>> * src/share/vm/runtime/atomic.hpp
>>>>>>>   - Defines enum memory_order and adds a parameter to cmpxchg.
>>>>>>>
>>>>>>> * src/share/vm/runtime/atomic.cpp
>>>>>>> * src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
>>>>>>> * src/os_cpu/bsd_zero/vm/atomic_bsd_zero.inline.hpp
>>>>>>> * src/os_cpu/linux_aarch64/vm/atomic_linux_aarch64.inline.hpp
>>>>>>> * src/os_cpu/linux_sparc/vm/atomic_linux_sparc.inline.hpp
>>>>>>> * src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp
>>>>>>> * src/os_cpu/linux_zero/vm/atomic_linux_zero.inline.hpp
>>>>>>> * src/os_cpu/solaris_sparc/vm/atomic_solaris_sparc.inline.hpp
>>>>>>> * src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp
>>>>>>> * src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp
>>>>>>>   - Added a parameter for each cmpxchg function to follow
>>>>>>>     the change of atomic.hpp. Their implementations are not changed.
>>>>>>>
>>>>>>> * src/os_cpu/aix_ppc/vm/atomic_aix_ppc.inline.hpp
>>>>>>> * src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp
>>>>>>>   - Added a parameter for each cmpxchg function to follow
>>>>>>>     the change of atomic.hpp. In addition, implementations
>>>>>>>     are changed corresponding to the specified memory_order.
>>>>>>>
>>>>>>> * src/share/vm/oops/oop.hpp
>>>>>>> * src/share/vm/oops/oop.inline.hpp
>>>>>>>   - Add a memory_order parameter to use relaxed cmpxchg in
>>>>>>>     cas_set_mark and cas_forward_to.
>>>>>>>
>>>>>>> * src/share/vm/gc/parallel/psPromotionManager.cpp
>>>>>>> * src/share/vm/gc/parallel/psPromotionManager.inline.hpp
>>>>>>>
>>>>>>> Martin tested this changeset on linux_x86_64, linux_ppc64le and
>>>>>>> darwin_intel64.
>>>>>>> Though more time is needed to test on the other platform, we would like to ask
>>>>>>> reviews and start discussion on this changeset.
>>>>>>> I also tested this changeset with SPECjbb2013 and confirmed that gc pause time
>>>>>>> is reduced.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Hiroshi
>>>>>>> -----------------------
>>>>>>> Hiroshi Horii, Ph.D.
>>>>>>> IBM Research - Tokyo