RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes (original) (raw)

Volker Simonis volker.simonis at gmail.com
Wed Jan 22 09:42:18 PST 2014


On Wed, Jan 22, 2014 at 6:02 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:

Hi Goetz

On 1/22/14 1:20 AM, Lindenmaier, Goetz wrote:

Hi Vladimir, Thanks for testing and pushing! Will you push this also to stage? I assume we can handle this as the other three hotspot changes, without a new bug-id? Yes, I will backport it. What about JDK changes Volker pushed: 8028537, 8031134, 8031997 and new one from today 8031581? Should I backport all of them into 8u stage? From conversion between Volker and Alan some of them need backport a fix from jdk9. Or I am mistaking?

It would be great if you could push 8028537, 8031134, 8031997 and 8031581 into 8u stage.

There is no real source-code level dependency between these changes and the change "7133499: (fc) FileChannel.read not preempted by asynchronous close on OS X)" which is still to be pushed by Alan to jdk9. However the AIX-port will automatically benefit from 7133499 once it will be pushed.

Also, when do you think we (you unfortunately) should update the repos again? Stage-9 maybe after Volkers last change is submitted? After I test and push 8031581 I will do sync with latest jdk9 sources (b01).

It would be great if you could push the change from:

http://cr.openjdk.java.net/~simonis/webrevs/8031581_3

It is exactly the same like the one from http://cr.openjdk.java.net/~simonis/webrevs/8031581_2 with only one comment line changed to fix the typo "legel"->"legal" in src/share/native/java/util/zip/zip_util.c .

Thanks a lot, Volker

I will build bundles and give them to SQE for final testing.

Thanks, Vladimir

Best regards, Goetz

-----Original Message----- From: hotspot-dev-bounces at openjdk.java.net [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov Sent: Dienstag, 21. Januar 2014 21:00 Cc: 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes Thanks. I am pushing it. Vladimir On 1/21/14 5:19 AM, Lindenmaier, Goetz wrote: Sorry, I missed that. fixed. Best regards, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Dienstag, 21. Januar 2014 12:53 To: Lindenmaier, Goetz; Vladimir Kozlov Cc: 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes Thanks Goetz! This typo still exists: + bool wrotevolatile; // Did we write a final field? s/final/volatile/ Otherwise no further comments from me. David On 21/01/2014 7:22 PM, Lindenmaier, Goetz wrote: Hi, I made a new webrev http://cr.openjdk.java.net/~goetz/webrevs/8029101-3-raw/ differing from http://cr.openjdk.java.net/~goetz/webrevs/8029101-2-raw/ only in the comments. I removed // Support ordering of "Independent Reads of Independent Writes". everywhere, and edited the comments in the globalDefinition*.hpp files. Best regards, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Dienstag, 21. Januar 2014 05:55 To: Lindenmaier, Goetz; Vladimir Kozlov Cc: 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes Hi Goetz, On 17/01/2014 6:39 PM, Lindenmaier, Goetz wrote:

Hi, I tried to come up with a webrev that implements the change as proposed in your mails: http://cr.openjdk.java.net/~goetz/webrevs/8029101-2-raw/ Wherever I used CPUNOTMULTIPLECOPYATOMIC, I use supportIRIWfornotmultiplecopyatomiccpu.

Given the flag name the commentary eg: + // Support ordering of "Independent Reads of Independent Writes". + if (supportIRIWfornotmultiplecopyatomiccpu) { seems somewhat redundant. I left the definition and handling of wrotevolatile in the code, without any protection. + bool wrotevolatile; // Did we write a final field? s/final/volatile I protected issuing the barrier for volatile in constructors with PPC64ONLY() , and put it on one line. I removed the comment in librarycall.cpp. I also removed the sentence " Solution: implement volatile read as sync-load-acquire." from the comments as it's PPC specific. I think the primary IRIW comment/explanation should go in globalDefinitions.hpp where supportIRIWfornotmultiplecopyatomiccpu is defined. Wrt. to C1: we plan to port C1 to PPC64, too. During that task, we will fix these issues in C1 if nobody did it by then. I've filed: https://bugs.openjdk.java.net/browse/JDK-8032366 "Implement C1 support for IRIW conformance on non-multiple-copy-atomic platforms" to cover this task, as it may be needed sooner rather than later. Wrt. to performance: Oracle will soon do heavy testing of the port. If any performance problems arise, we still can add #ifdef PPC64 to circumvent this. Ok. Thanks, David Best regards, Goetz.

-----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Donnerstag, 16. Januar 2014 10:05 To: Vladimir Kozlov Cc: Lindenmaier, Goetz; 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes On 16/01/2014 6:54 PM, Vladimir Kozlov wrote: On 1/16/14 12:34 AM, David Holmes wrote: On 16/01/2014 5:13 PM, Vladimir Kozlov wrote: This is becoming ugly #ifdef mess. In compiler code we are trying to avoid them. I suggested to have wrotevolatile without #ifdef and I want to keep it this way, it could be useful to have such info on other platforms too. But I would suggest to remove PPC64 comments in parse.hpp. In globalDefinitions.hpp after globalDefinitionsppc.hpp define a value which could be checked in all places instead of #ifdef:

I asked for the ifdef some time back as I find it much preferable to have this as a build-time construct rather than a runtime one. I don't want to have to pay anything for this if we don't use it. Any decent C++ compiler will optimize expressions with such constants defined in header files. I insist to avoid #ifdefs in C2 code. I really don't like the code with #ifdef in unsafe.cpp but I can live with it. If you insist then we may as well do it all the same way. Better to be consistent. My apologies Goetz for wasting your time going back and forth on this. That aside I have a further concern with this IRIW support - it is incomplete as there is no C1 support, as PPC64 isn't using client. If this is going on then we (which probably means the Oracle 'we') need to add the missing C1 code. David ----- Vladimir David #ifdef CPUNOTMULTIPLECOPYATOMIC const bool supportIRIWfornotmultiplecopyatomiccpu = true; #else const bool supportIRIWfornotmultiplecopyatomiccpu = false; #endif or supportIRIWfornotmultiplecopyatomiccpu, whatever and then: _#define GETFIELDVOLATILE(obj, offset, typename, v) _ _oop p = JNIHandles::resolve(obj); _ + if (supportIRIWfornotmultiplecopyatomiccpu) _OrderAccess::fence(); _ volatile typename v = OrderAccess::loadacquire((volatile typename*)indexoopfromfieldoffsetlong(p, offset)); And: + if (supportIRIWfornotmultiplecopyatomiccpu && field->isvolatile()) { + insertmembar(OpMemBarVolatile); // StoreLoad barrier + } And so on. The comments will be needed only in globalDefinitions.hpp The code in parse1.cpp could be put on one line: + if (wrotefinal() PPC64ONLY( || (wrotevolatile() && method()->isinitializer()) )) { Thanks, Vladimir On 1/15/14 9:25 PM, David Holmes wrote: On 16/01/2014 1:28 AM, Lindenmaier, Goetz wrote: Hi David, I updated the webrev: http://cr.openjdk.java.net/~goetz/webrevs/8029101-1-raw/ - I removed the IRIW example in parse3.cpp - I adapted the comments not to point to that comment, and to reflect the new flagging. Also I mention that we support the volatile constructor issue, but that it's not standard. - I protected issuing the barrier for the constructor by PPC64. I also think it's better to separate these this way.

Sorry if I wasn't clear but I'd like the wrotevolatile field declaration and all uses to be guarded by ifdef PPC64 too please. One nit I missed before. In src/share/vm/opto/librarycall.cpp this comment doesn't make much sense to me and refers to ppc specific stuff in a shared file: if (isvolatile) { ! if (!isstore) { insertmembar(OpMemBarAcquire); ! } else { ! #ifndef CPUNOTMULTIPLECOPYATOMIC ! // Changed volatiles/Unsafe: lwsync-store, sync-load-acquire. insertmembar(OpMemBarVolatile); + #endif + } I don't think the comment is needed. Thanks, David Thanks for your comments! Best regards, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Mittwoch, 15. Januar 2014 01:55 To: Lindenmaier, Goetz Cc: 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes Hi Goetz, Sorry for the delay in getting back to this. The general changes to the volatile barriers to support IRIW are okay. The guard of CPUNOTMULTIPLECOPYATOMIC works for this (though more specifically it is not-multiple-copy-atomic-and-chooses-to-support-IRIW). I find much of the commentary excessive, particularly for shared code. In particular the IRIW example in parse3.cpp - it seems a strange place to give the explanation and I don't think we need it to that level of detail. Seems to me that is present is globalDefinitionsppc.hpp is quite adequate. The changes related to volatile writes in the constructor, as discussed are not required by the Java Memory Model. If you want to keep these then I think they should all be guarded with PPC64 because it is not related to CPUNOTMULTIPLECOPYATOMIC but a choice being made by the PPC64 porters. Thanks, David On 14/01/2014 11:52 PM, Lindenmaier, Goetz wrote: Hi, I updated this webrev. I detected a small flaw I made when editing this version. The #endif in line 322, parse3.cpp was in the wrong line. I also based the webrev on the latest version of the stage repo. http://cr.openjdk.java.net/~goetz/webrevs/8029101-1-raw/ Best regards, Goetz. -----Original Message----- From: Lindenmaier, Goetz Sent: Freitag, 20. Dezember 2013 13:47 To: David Holmes Cc: 'hotspot-dev at openjdk.java.net'; 'ppc-aix-port-dev at openjdk.java.net' Subject: RE: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes Hi David, So we can at least undo #4 now we have established those tests were not required to pass. We would prefer if we could keep this in. We want to avoid that it's blamed on the VM if java programs are failing on PPC after they worked on x86. To clearly mark it as overfulfilling the spec I would guard it by a flag as proposed. But if you insist I will remove it. Also, this part is not that performance relevant. A compile-time guard (ifdef) would be better than a runtime one I think I added a compile-time guard in this new webrev: http://cr.openjdk.java.net/~goetz/webrevs/8029101-1-raw/ I've chosen CPUNOTMULTIPLECOPYATOMIC. This introduces several double negations I don't like, (#ifNdef CPUNOTMULTIPLECOPYATOMIC) but this way I only have to change the ppc platform. Best regards, Goetz P.S.: I will also be available over the Christmas period. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Freitag, 20. Dezember 2013 05:58 To: Lindenmaier, Goetz Cc: 'hotspot-dev at openjdk.java.net'; 'ppc-aix-port-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes Sorry for the delay, it takes a while to catch up after two weeks vacation :) Next vacation (ie next two weeks) I'll continue to check emails. On 2/12/2013 6:33 PM, Lindenmaier, Goetz wrote: Hi, ok, I understand the tests are wrong. It's good this issue is settled. Thanks Aleksey and Andreas for going into the details of the proof! About our change: David, the causality is the other way round. The change is about IRIW. 1. To pass IRIW, we must use sync instructions before loads.

This is the part I still have some question marks over as the implications are not nice for performance on non-TSO platforms. But I'm no further along in processing that paper I'm afraid. 2. If we do syncs before loads, we don't need to do them after stores. 3. If we don't do them after stores, we fail the volatile constructor tests. 4. So finally we added them again at the end of the constructor after stores to pass the volatile constructor tests. So we can at least undo #4 now we have established those tests were not required to pass. We originally passed the constructor tests because the ppc memory order instructions are not as find-granular as the operations in the IR. MemBarVolatile is specified as StoreLoad. The only instruction on PPC that does StoreLoad is sync. But sync also does StoreStore, therefore the MemBarVolatile after the store fixes the constructor tests. The proper representation of the fix in the IR would be adding a MemBarStoreStore. But now it's pointless anyways. I'm not happy with the ifdef approach but I won't block it. I'd be happy to add a property OrderAccess::cpuismultiplecopyatomic() A compile-time guard (ifdef) would be better than a runtime one I think - similar to the SUPPORTSNATIVECX8 optimization (something semantic based not architecture based) as that will allows for turning this on/off for any architecture for testing purposes. Thanks, David or the like to guard the customization. I'd like that much better. Or also OrderAccess::needssupportiriwordering() VMVersion::needssupportiriwordering()

Best regards, Goetz.

-----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Donnerstag, 28. November 2013 00:34 To: Lindenmaier, Goetz Cc: 'hotspot-dev at openjdk.java.net'; 'ppc-aix-port-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes TL;DR version: Discussion on the c-i list has now confirmed that a constructor-barrier for volatiles is not required as part of the JMM specification. It may be required in an implementation that doesn't pre-zero memory to ensure you can't see uninitialized fields. So the tests for this are invalid and this part of the patch is not needed in general (ppc64 may need it due to other factors). Re: "multiple copy atomicity" - first thanks for correcting the term :) Second thanks for the reference to that paper! For reference: "The memory system (perhaps involving a hierarchy of buffers and a complex interconnect) does not guarantee that a write becomes visible to all other hardware threads at the same time point; these architectures are not multiple-copy atomic." This is the visibility issue that I referred to and affects both ARM and PPC. But of course it is normally handled by using suitable barriers after the stores that need to be visible. I think the crux of the current issue is what you wrote below: > The fixes for the constructor issue are only needed because we > remove the sync instruction from behind stores (parse3.cpp:320) > and place it before loads. I hadn't grasped this part. Obviously if you fail to do the sync after the store then you have to do something around the loads to get the same results! I still don't know what lead you to the conclusion that the only way to fix the IRIW issue was to put the fence before the load - maybe when I get the chance to read that paper in full it will be clearer. So ... the basic problem is that the current structure in the VM has hard-wired one choice of how to get the right semantics for volatile variables. You now want to customize that but not all the requisite hooks are present. It would be better if volatileload and volatilestore were factored out so that they could be implemented as desired per-platform. Alternatively there could be pre- and post- hooks that could then be customized per platform. Otherwise you need platform-specific ifdef's to handle it as per your patch. I'm not happy with the ifdef approach but I won't block it. I think this is an area where a lot of clean up is needed in the VM. The barrier abstractions are a confused mess in my opinion. Thanks, David ----- On 28/11/2013 3:15 AM, Lindenmaier, Goetz wrote: Hi, I updated the webrev to fix the issues mentioned by Vladimir: http://cr.openjdk.java.net/~goetz/webrevs/8029101-0-raw/ I did not yet add the OrderAccess::needssupportiriwordering() VMVersion::needssupportiriwordering() or OrderAccess::cpuismultiplecopyatomic() to reduce #defined, as I got no further comment on that. WRT to the validity of the tests and the interpretation of the JMM I feel not in the position to contribute substantially. But we would like to pass the torture test suite as we consider this a substantial task in implementing a PPC port. Also we think both tests show behavior a programmer would expect. It's bad if Java code runs fine on the more common x86 platform, and then fails on ppc. This will always first be blamed on the VM. The fixes for the constructor issue are only needed because we remove the sync instruction from behind stores (parse3.cpp:320) and place it before loads. Then there is no sync between volatile store and publishing the object. So we add it again in this one case (volatile store in constructor). @David Sure. There also is no solution as you require for the taskqueue problem yet, and that's being discussed now for almost a year. It may have started a year ago but work on it has hardly been continuous. That's not true, we did a lot of investigation and testing on this issue. And we came up with a solution we consider the best possible. If you have objections, you should at least give the draft of a better solution, we would volunteer to implement and test it. Similarly, we invested time in fixing the concurrency torture issues. @David What is "multiple-read-atomicity"? I'm not familiar with the term and can't find any reference to it. We learned about this reading "A Tutorial Introduction to the ARM and POWER Relaxed Memory Models" by Luc Maranget, Susmit Sarkar and Peter Sewell, which is cited in "Correct and Efficient Work-Stealing for Weak Memory Models" by Nhat Minh LĂȘ, Antoniu Pop, Albert Cohen and Francesco Zappa Nardelli (PPoPP `13) when analysing the taskqueue problem. http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf I was wrong in one thing, it's called multiple copy atomicity, I used 'read' instead. Sorry for that. (I also fixed that in the method name above). Best regards and thanks for all your involvements, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Mittwoch, 27. November 2013 12:53 To: Lindenmaier, Goetz Cc: 'hotspot-dev at openjdk.java.net'; 'ppc-aix-port-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes Hi Goetz, On 26/11/2013 10:51 PM, Lindenmaier, Goetz wrote: Hi David, -- Volatile in constuctor AFAIK we have not seen those tests fail due to a missing constructor barrier. We see them on PPC64. Our test machines have typically 8-32 processors and are Power 5-7. But see also Aleksey's mail. (Thanks Aleksey!) And see follow ups - the tests are invalid. -- IRIW issue I can not possibly answer to the necessary level of detail with a few moments thought. Sure. There also is no solution as you require for the taskqueue problem yet, and that's being discussed now for almost a year. It may have started a year ago but work on it has hardly been continuous. You are implying there is a problem here that will impact numerous platforms (unless you can tell me why ppc is so different?) No, only PPC does not have 'multiple-read-atomicity'. Therefore I contributed a solution with the #defines, and that's correct for all, but not nice, I admit. (I don't really know about ARM, though). So if I can write down a nicer solution testing for methods that are evaluated by the C-compiler I'm happy. The problem is not that IRIW is not handled by the JMM, the problem is that store sync does not assure multiple-read-atomicity, only sync load does so on PPC. And you require multiple-read-atomicity to pass that test. What is "multiple-read-atomicity"? I'm not familiar with the term and can't find any reference to it. Thanks, David The JMM is fine. And store MemBarVolatile is fine on x86, sparc etc. as there exist assembler instructions that do what is required. So if you are off soon, please let's come to a solution that might be improvable in the way it's implemented, but that allows us to implement a correct PPC64 port. Best regards, Goetz.

-----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Tuesday, November 26, 2013 1:11 PM To: Lindenmaier, Goetz Cc: 'Vladimir Kozlov'; 'Vitaly Davidovich'; 'hotspot-dev at openjdk.java.net'; 'ppc-aix-port-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes Hi Goetz, On 26/11/2013 9:22 PM, Lindenmaier, Goetz wrote: Hi everybody, thanks a lot for the detailed reviews! I'll try to answer to all in one mail. Volatile fields written in constructor aren't guaranteed by JMM to occur before the reference is assigned; We don't think it's correct if we omit the barrier after initializing a volatile field. Previously, we discussed this with Aleksey Shipilev and Doug Lea, and they agreed. Also, concurrency torture tests LongVolatileTest AtomicIntegerInitialValueTest will fail. (In addition, observing 0 instead of the inital value of a volatile field would be very counter-intuitive for Java programmers, especially in AtomicInteger.) The affects of unsafe publication are always surprising - volatiles do not add anything special here. AFAIK there is nothing in the JMM that requires the constructor barrier - discussions with Doug and Aleksey notwithstanding. AFAIK we have not seen those tests fail due to a missing constructor barrier. proposed for PPC64 is to make volatile reads extremely heavyweight Yes, it costs measurable performance. But else it is wrong. We don't see a way to implement this cheaper. - these algorithms should be expressed using the correct OrderAccess operations Basically, I agree on this. But you also have to take into account that due to the different memory ordering instructions on different platforms just implementing something empty is not sufficient. An example: MemBarRelease // means LoadStore, StoreStore barrier MemBarVolatile // means StoreLoad barrier If these are consecutively in the code, sparc code looks like this: MemBarRelease --> membar(Assembler::LoadStore | Assembler::StoreStore) MemBarVolatile --> membar(Assembler::StoreLoad) Just doing what is required. On Power, we get suboptimal code, as there are no comparable, fine grained operations: MemBarRelease --> lwsync // Doing LoadStore, StoreStore, LoadLoad MemBarVolatile --> sync // // Doing LoadStore, StoreStore, LoadLoad, StoreLoad obviously, the lwsync is superfluous. Thus, as PPC operations are more (too) powerful, I need an additional optimization that removes the lwsync. I can not implement MemBarRelease empty, as it is also used independently. Back to the IRIW problem. I think here we have a comparable issue. Doing the MemBarVolatile or the OrderAccess::fence() before the read is inefficient on platforms that have multiple-read-atomicity. I would propose to guard the code by VMVersion::cpuismultiplereadatomic() or even better OrderAccess::cpuismultiplereadatomic() Else, David, how would you propose to implement this platform independent? (Maybe we can also use above method in taskqueue.hpp.) I can not possibly answer to the necessary level of detail with a few moments thought. You are implying there is a problem here that will impact numerous platforms (unless you can tell me why ppc is so different?) and I can not take that on face value at the moment. The only reason I can see IRIW not being handled by the JMM requirements for volatile accesses is if there are global visibility issues that are not addressed - but even then I would expect heavy barriers at the store would deal with that, not at the load. (This situation reminds me of the need for read-barriers on Alpha architecture due to the use of software cache-coherency rather than hardware cache-coherency - but we don't have that on ppc!) Sorry - There is no quick resolution here and in a couple of days I will be heading out on vacation for two weeks. David ----- Best regards, Goetz. -- Other ports: The IRIW issue requires at least 3 processors to be relevant, so it might not happen on small machines. But I can use PPCONLY instead of PPC64ONLY if you request so (and if we don't get rid of them). -- MemBarStoreStore after initialization I agree we should not change it in the ppc port. If you wish, I can prepare an extra webrev for hotspot-comp.

-----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Tuesday, November 26, 2013 2:49 AM To: Vladimir Kozlov Cc: Lindenmaier, Goetz; 'hotspot-dev at openjdk.java.net'; 'ppc-aix-port-dev at openjdk.java.net' Subject: Re: RFR(M): 8029101: PPC64 (part 211): ordering of Independent Reads of Independent Writes Okay this is my second attempt at answering this in a reasonable way :) On 26/11/2013 10:51 AM, Vladimir Kozlov wrote: I have to ask David to do correctness evaluation. From what I understand what we see here is an attempt to fix an existing issue with the implementation of volatiles so that the IRIW problem is addressed. The solution proposed for PPC64 is to make volatile reads extremely heavyweight by adding a fence() when doing the load. Now if this was purely handled in ppc64 source code then I would be happy to let them do whatever they like (surely this kills performance though!). But I do not agree with the changes to the shared code that allow this solution to be implemented - even with PPC64ONLY this is polluting the shared code. My concern is similar to what I said with the taskQueue changes - these algorithms should be expressed using the correct OrderAccess operations to guarantee the desired properties independent of architecture. If such a "barrier" is not needed on a given architecture then the implementation in OrderAccess should reduce to a no-op. And as Vitaly points out the constructor barriers are not needed under the JMM. I am fine with suggested changes because you did not change our current code for our platforms (please, do not change doexits() now). But may be it should be done using more general query which is set depending on platform: OrderAccess::needssupportiriwordering() or similar to what we use now: VMVersion::needssupportiriwordering() Every platform has to support IRIW this is simply part of the Java Memory Model, there should not be any need to call this out explicitly like this. Is there some subtlety of the hardware I am missing here? Are there visibility issues beyond the ordering constraints that the JMM defines? From what I understand our ppc port is also affected. David? We can not discuss that on an OpenJDK mailing list - sorry. David ----- In librarycall.cpp can you add {}? New comment should be inside else {}. I think you should make wrotevolatile field not ppc64 specific which will be set to 'true' only on ppc64. Then you will not need PPC64ONLY() except in doputxxx() where it is set to true. Too many #ifdefs. In doputxxx() can you combine your changes: if (isvol) { // See comment in dogetxxx(). #ifndef PPC64 insertmembar(OpMemBarVolatile); // Use fat membar #else if (isfield) { // Add MemBarRelease for constructors which write volatile field (PPC64). setwrotevolatile(true); } #endif } Thanks, Vladimir On 11/25/13 8:16 AM, Lindenmaier, Goetz wrote: Hi, I preprared a webrev with fixes for PPC for the VolatileIRIWTest of the torture test suite: http://cr.openjdk.java.net/~goetz/webrevs/8029101-0-raw/ Example: volatile x=0, y=0



| Thread 0 | | Thread 1 | | Thread 2 | | Thread 3 |

write(x=1) read(x) write(y=1) read(y) read(y) read(x) Disallowed: x=1, y=0 y=1, x=0

Solution: This example requires multiple-copy-atomicity. This is only assured by the sync instruction and if it is executed in the threads doing the loads. Thus we implement volatile read as sync-load-acquire and omit the sync/MemBarVolatile after the volatile store. MemBarVolatile happens to be implemented by sync. We fix this in C2 and the cpp interpreter. This addresses a similar issue as fix "8012144: multiple SIGSEGVs fails on staxf" for taskqueue.hpp. Further this change contains a fix that assures that volatile fields written in constructors are visible before the reference gets published. Looking at the code, we found a MemBarRelease that to us, seems too strong. We think in parse1.cpp doexits() a MemBarStoreStore should suffice. What do you think? Please review and test this change. Best regards, Goetz.



More information about the hotspot-dev mailing list