[9] RFR(S): 8075324: Costs of memory operands in aarch64.ad are inconsistent (original) (raw)

Andrew Dinn adinn at redhat.com
Tue Mar 17 15:15:03 UTC 2015


On 17/03/15 12:42, Tobias Hartmann wrote:

Hi,

please review the following patch. https://bugs.openjdk.java.net/browse/JDK-8075324 http://cr.openjdk.java.net/~thartmann/8075324/webrev.00/

I agree one of these costs is wrong (indOffI) but otherwise I think the costs are correct. That also means I think your addition of the two new operands is a mistake. I suspect you may have been confused by the rather unfortunate choice of names for the operands -- which are rather misleading. Also, the way the instruction generation is hidden behind some parameterised functions (ref below) makes it hard to see what is happening here. In particular, it makes it look like your operands are doing something that they cannot actually do.

Many of the operand definitions are supposed to have cost 0 because the matched operands can be mapped directly to register arguments for a single ldr/str insn (of the appropriate flavour). These are the cases where

i) there is no indirect register and an offset in a relevant range

or where

ii) where there is an indirect register and zero offset

If the matcher can match an operand from these cases then the cost of the instruction combined with the complexity of the reduced operand (added in automatically by the matching algorithm) should be all that is needed to drive the selection process.

However, in the operand definitions where

iii) there is an indirect register and non-zero offset

encoding of the load/store requires planting an lea (which manifests as an add) to add the offset to the base register followed by a ldr/str which does the scaled indirect load. That's because AArch64 does not provide a scaled indirect with offset address mode.

For these cases the cost of the operand should be INSN_COST in order to account precisely for that extra add and the zero cost of passing the resulting offsetted address into the following ldr/str which does the scaled indirect load. There is no need to prefer this sequence. If two separate instructions can do this cheaper or at the same cost then that is fine.

n.b. your newly provided operand format definition which prints

ldrsbw R0, R10, R1, #16 I2L # byte

exemplifies the problem here. There is no instruction which does a scaled indirect load with offset in the AArch64 instruction. If you take look at the static methods named loadStore defined in the source block you can see how this instruction will actually be generated. PrintAssembly will reveal that it gets translated as follows

add     R8, R10, #16
ldrsbw  R0, R1 sxtw

Let's look at some cases:

You are right that indOffI and indOffL should have the same cost but that cost should be 0 because they can both be encoded with a single ldr/str using only a base register and offset with no indirect register.

ldr(dst, base, disp)

By constrast, indIndexScaledI2L is correct to have cost 0. It has a scaled index register but offset 0. So, it can be encoded using a single ldr/str taking a scaled indirect register as argument.

ldr(dst, base, index, SXTW)

Similarly, indIndexScaledOffsetL should have cost INSN_COST because it will cause planting of an add followed by some flavour of ldr/str.

add(rscratch1, base, disp) ldr(dst, index, LSL(0))

So, the costs should be as follows

Cost 0:

indirect indIndexScaledI2L indIndexScaled indIndex indOffI indOffL indirectN indIndexScaledI2LN indIndexScaledN indIndexN indOffIN indOffLN

Cost INSN_COST:

indIndexScaledOffsetI indIndexScaledOffsetL indIndexScaledOffsetI2L indIndexScaledOffsetI2LN

Your two extra operands are not needed.

regards,

Andrew Dinn



More information about the hotspot-compiler-dev mailing list