RFR: AArch64 -- patch to make volatile reads and stores use ldar/stlr (original) (raw)

Andrew Dinn adinn at redhat.com
Wed Apr 22 09:00:08 UTC 2015


Hi Vladimir,

Thanks for responding.

On 21/04/15 18:03, Vladimir Kozlov wrote:

Would it help if we add a new flag to NodeFlags to indicate volatile memory node?

Yes, I think this would provide some benefit although I don't know if it will justify cost of implementing and testing the necessary shared code changes given that the benefit is only partial. Even with such a flag we would still need to do some of the work that this patch does. Also, the flag would not help with a the next optimization we are planning (removing unnecessary barriers from CAS sequences).

Your suggestion will /simplify/ the predicates for LoadX/StoreX but will not make much difference to execution time. The current predicates for these nodes, needs_acquiring_load() and needs_releasing_store(), check immediately whether a LoadX is acquiring or a StoreX is releasing. So, we have a quick filter for the /common path/ case where we simply generate an ldr or stlr (I assume volatile ops are relatively rare).

However, we cannot rely solely on these existing flags as they are sometimes set in other (non-volatile) cases. Currently, if the acquiring or releasing flag is set the predicates go on to investigate the nodes around the LoadX/StoreX looking for a specific configuration of MemBar_XXX nodes. Adding the volatile flag would remove the need to do this extra work in the /uncommon/ case.

Unfortunately, the other half of the patch involves eliding generation of dmb instructions for the MemBar_XXX nodes associated with the a LoadX or StoreX. Three other predicates are used for this purpose, unnecesary_membar_acquire, unnecesary_membar_release and unnecesary_membar_volatile. They search the ideal graph around the membar to see whether it belongs to a signature sequence containing an acquiring load or releasing store.

The check involves i) finding a related LoadX or StoreX, ii) checking it is acquiring or releasing and ii) ensuring it is either tailed or bracketed by a node group with a 'signature' shape which includes the MemBar we started from. A volatile flag would not really speed up this check and would not do much to simplify it.

The current predicates test for very specific signature configurations. So, in cases where the MemBar node is /not/ part of a signature volatile sequence they detect this very rapidly. In the common case the predicate immediately fails to find the required sort of parent/child node in the tree surrounding the initial MemBar (i.e. one which is not a LoadX/StoreX, Proj or MemBar). If we do arrive at a LoadX/StoreX in the node tree we could, perhaps, check the volatile flag and short-circuit any further graph traversal but the bulk of the work involves traversing from the initial MemBar to the LoadX/StoreX.

n.b. I say perhaps because I am not yet certain that we just test the volatile flag. I don't know for sure that we could not arrive at the memory node from a MemBar which was not part of the signature sequence (I /am/ very confident that the current checks correctly identify volatile node sequences and only those sequences).

Could we punt this question of providing the volatile flag, considering it as a potential follow-up improvement? I'd prefer to get this AArch64-only change in as is -- or, at least, with some tweaking -- before going on to consider something which requires changes to shared code.

Also, I mentioned the CAS case above because that is what I am looking at next and it extends the technique used in the current patch. Currently CAS operations translate to something which looks like this

dmb ... retry: ldaxr ... stlxr cbz retry ... dmb

The ldaxr/stlxr pair are generated in response to a CompareAndSwapX node and the dmb instructions arise from MemBar nodes. For a CAS AArch64 can safely elide the dmb instructions. So, the next improvement will be to extend the unnecessary_membar_xxx predicates to elide the dmb generation when a suitable configuration of MemBar_XXX and CompareAndSwapX nodes is seen. Of course, this test will not be helped at all by provision of a volatile flag.

regards,

Andrew Dinn

Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 3798903 Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters (USA), Michael O'Neill (Ireland)



More information about the hotspot-dev mailing list