[9] RFR(S): 8075136: Unnecessary sign extension for byte array access (original) (raw)
Tobias Hartmann tobias.hartmann at oracle.com
Tue Mar 17 12:33:50 UTC 2015
- Previous message: [9] RFR(S): 8075136: Unnecessary sign extension for byte array access
- Next message: [9] RFR(S): 8075136: Unnecessary sign extension for byte array access
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 13.03.2015 19:03, Vladimir Kozlov wrote:
Looks good. What about other platforms? I see the same optimization in aarch64.ad
Thanks, Vladimir. I fixed the issue on ARM64 as well. Looks like other platforms are not affected.
http://cr.openjdk.java.net/~thartmann/8075136/webrev.03/
While verifying, I noticed that the costs of memory operands in aarch64.ad are inconsistent:
The following operands have cost 0:
- indirect
- indIndexScaledI2L
- indIndexScaled
- indIndex
- indOffL
- indirectN
- indIndexScaledOffsetIN
- indIndexScaledI2LN
- indIndexScaledN
- indIndexN
- indOffIN
- indOffLN
Whereas the following operands have cost 'INSN_COST':
- indIndexScaledOffsetI
- indIndexScaledOffsetL
- indIndexScaledOffsetI2L
- indOffI
- indIndexScaledOffsetLN
- indIndexScaledOffsetI2LN
This does not make sense to me. In this case 'AddP (AddP reg (ConvI2L ireg)) off' is matched to 'addP_reg_reg_ext' (cost INSN_COST) and 'indOffL' (cost 0) instead of the newly added 'indIndexOffsetI2L' (cost INSN_COST) because the sum of the costs is equal.
I filed JDK-8075324 [1] to take care of this. I would suggest that we set the cost of a memory operand to either 0 (if only immediates) or 10 (if more complex) like we do on x86.
I verified that the patch (together with the fix for JDK-8075324) solves the issue.
Before: 034 B3: # N1 <- B2 Freq: 0.999998 034 + add R10, R10, R1, sxtw # ptr 038 + ldrsbw R0, [R10, #16] # byte After: 034 B3: # N1 <- B2 Freq: 0.999998 034 + ldrsbw R0, R10, R1, #16 I2L # byte
Thanks, Tobias
[1] https://bugs.openjdk.java.net/browse/JDK-8075324
Thanks, Vladimir
On 3/13/15 7:15 AM, Tobias Hartmann wrote: Hi,
please review the following patch. https://bugs.openjdk.java.net/browse/JDK-8075136 http://cr.openjdk.java.net/~thartmann/8075136/webrev.00/ Problem: C2 adds an unnecessary 'movslq' sign extension while compiling a byte array access: public static byte accessByte(int index) { return byteArr[index]; } 0x00007f76f4747208: movslq %esi,%r11 0x00007f76f474720b: movsbl 0x10(%r10,%r11,1),%eax Where the 'movslq' is not necessary because we emit range checks guaranteeing that index %esi is not negative. For a char array access no such sign extension is created: public static char accessChar(int index) { return charArr[index]; } 0x00007fab3916b188: movzwl 0x10(%r10,%rsi,2),%eax This is because we only have matching rules to fold the corresponding ConvI2LNode in the following case: match(AddP (AddP (DecodeN reg) (LShiftL (ConvI2L idx) scale)) off); This applies to charAccess [1] but not to byteAccess [2] because the byte array access does not require a scaling factor. Solution: I added the corresponding matching rule to fold the ConvI2LNode. match(AddP (AddP (DecodeN reg) (ConvI2L idx)) off); Testing: - Testcase - JPRT Thanks, Tobias [1] https://bugs.openjdk.java.net/secure/attachment/26108/accessChar.png [2] https://bugs.openjdk.java.net/secure/attachment/26107/accessByte.png
- Previous message: [9] RFR(S): 8075136: Unnecessary sign extension for byte array access
- Next message: [9] RFR(S): 8075136: Unnecessary sign extension for byte array access
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the hotspot-compiler-dev mailing list