[9] RFR(S): 8075136: Unnecessary sign extension for byte array access (original) (raw)
Tobias Hartmann tobias.hartmann at oracle.com
Wed Mar 18 09:51:21 UTC 2015
- Previous message: [9] RFR(S): 8075136: Unnecessary sign extension for byte array access
- Next message: [9] RFR(S): 8075136: Unnecessary sign extension for byte array access
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Andrew,
thanks a lot for clarifying this. Please see comments inline.
On 17/03/15 16:11, Andrew Dinn wrote:
Apologies for jumping the gun wiht my previous comment. Having reviewed the earlier traffic in this thread I now understand why the two new rules are needed (i.e. to deal with the case where there is no LShift in the matched operand).
Actually, the original intention of this change was to get rid of unnecessary sign extensions if the ConvI2LNode is not needed. For example, on x86 we avoid a sign extension if idx is guaranteed to be >= 0. See 'indPosIndexScaleOffset' in x86_64.ad:
operand indPosIndexScaleOffset(any_RegP reg, immL32 off, rRegI idx, immI2 scale) %{ constraint(ALLOC_IN_RC(ptr_reg)); predicate(n->in(2)->in(3)->in(1)->as_Type()->type()->is_long()->_lo >= 0);
[...]
On x86 this optimization was missing for the unscaled case (for example, a byte array access). That's why I added this case to the arm64 code.
I assumed that we do the same optimization on arm64 but now it seems to me that we don't. See the following output with the 'baseline' version:
Byte array access: 0x0000007f7453d8b4: add x10, x10, w1, sxtw 0x0000007f7453d8b8: ldrsb w0, [x10,#16]
Char array access: 0x0000007f7cb6f8b4: add x8, x10, #0x10 0x0000007f7cb6f8b8: ldrh w0, [x8,w1,sxtw #1]
In both cases we have a 'sxtw' which is not needed because we have range checks that guarantee that index (w1) is always >= 0.
The two new rules both specify an unscaled 32-bit integer register as index and, of course, the scale value is installed as 0 in the rule. So, the correct patch to function storeLoad is to include these two switch cases in the set which specify a sign-extend:
. . . switch (opcode) { case INDINDEXSCALEDOFFSETI2L: case INDINDEXSCALEDI2L: case INDINDEXSCALEDOFFSETI2LN: case INDINDEXSCALEDI2LN: + case INDINDEXOFFSETI2L: + case INDINDEXOFFSETI2LN: scale = Address::sxtw(size); break; default: scale = Address::lsl(size); . . .
Adapting the code as you suggested has no effect on the generated assembly code. It is equal to the baseline version. In fact, the indIndexOffsetI2L operand is not used.
0x0000007fa91c06b4: add x10, x10, w1, sxtw 0x0000007fa91c06b8: ldrsb w0, [x10,#16]
http://cr.openjdk.java.net/~thartmann/8075136/webrev.04/
I assume that the changes are not required, since we don't do the 'unnecessary sign extension optimization'. What do you think?
Thanks, Tobias
- Previous message: [9] RFR(S): 8075136: Unnecessary sign extension for byte array access
- Next message: [9] RFR(S): 8075136: Unnecessary sign extension for byte array access
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the hotspot-compiler-dev mailing list