Bit set intrinsic (original) (raw)
B. Blaser bsrbnd at gmail.com
Mon Nov 5 21:21:02 UTC 2018
- Previous message: Possible open file leak in com.sun.tools.javac.file.JavacFileManager
- Next message: Bit set intrinsic
On Wed, 31 Oct 2018 at 15:51, B. Blaser <bsrbnd at gmail.com> wrote:
> Last but not least, I implemented the C2 part (using the 8-bit AND/OR variant) to allow sharper comparisons in non-concurrent execution as well: http://cr.openjdk.java.net/~bsrbnd/boolpack/webrev.02/
> With 10e6 iterations the lock latency seems to be more or less negligible, and removing it would make the intrinsic about 10% faster than BitSet without synchronization.
This actually seems to be due to the following ANDB/ORB patterns, which are missing from x86_64.ad:
instruct andB_mem_rReg(memory dst, rRegI src, rFlagsReg cr)
%{
  match(Set dst (StoreB dst (AndI (LoadB dst) src)));
  effect(KILL cr);

  ins_cost(150);
  format %{ "andb    $dst, $src\t# byte" %}
  opcode(0x20);
  ins_encode(REX_breg_mem(src, dst), OpcP, reg_mem(src, dst));
  ins_pipe(ialu_mem_reg);
%}

instruct orB_mem_rReg(memory dst, rRegI src, rFlagsReg cr)
%{
  match(Set dst (StoreB dst (OrI (LoadB dst) src)));
  effect(KILL cr);

  ins_cost(150);
  format %{ "orb    $dst, $src\t# byte" %}
  opcode(0x08);
  ins_encode(REX_breg_mem(src, dst), OpcP, reg_mem(src, dst));
  ins_pipe(ialu_mem_reg);
%}
The next two lines:
- bits[index>>>3] |= (byte)(1 << (index & 7));
- bits[index>>>3] &= (byte)~(1 << (index & 7));
were assembled as:

1)
024     movsbl  R8, [RSI + #16 + R10]   # byte
02a     movl    R11, #1                 # int
030     sall    R11, RCX
033     movsbl  R11, R11                # i2b
037     orl     R11, R8                 # int
03a     movb    [RSI + #16 + R10], R11  # byte

2)
024     movsbl  R8, [RSI + #16 + R10]   # byte
02a     movl    R11, #1                 # int
030     sall    R11, RCX
033     not     R11
036     movsbl  R11, R11                # i2b
03a     andl    R8, R11                 # int
03d     movb    [RSI + #16 + R10], R8   # byte
instead of:

1)
024     movl    R11, #1                 # int
02a     sall    R11, RCX
02d     movsbl  R11, R11                # i2b
031     orb     [RSI + #16 + R10], R11  # byte

2)
024     movl    R11, #1                 # int
02a     sall    R11, RCX
02d     not     R11
030     movsbl  R11, R11                # i2b
034     andb    [RSI + #16 + R10], R11  # byte
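For reference, here is a minimal sketch of the kind of Java code in which the two statements above would appear. The class name ByteBitSet and its methods are hypothetical and are not taken from the webrev; only the bodies of set() and clear() come from the two lines quoted above. These are the byte-sized read-modify-write operations that the proposed patterns are meant to match.

```java
// Hypothetical illustration only: ByteBitSet is not part of the webrev.
final class ByteBitSet {
    // One bit per element, eight elements per byte.
    private final byte[] bits;

    ByteBitSet(int size) {
        // Round up to a whole number of bytes.
        bits = new byte[(size + 7) >>> 3];
    }

    // Statement 1): load byte, OR with a one-bit mask, store byte.
    void set(int index) {
        bits[index >>> 3] |= (byte) (1 << (index & 7));
    }

    // Statement 2): load byte, AND with the inverted mask, store byte.
    void clear(int index) {
        bits[index >>> 3] &= (byte) ~(1 << (index & 7));
    }

    // Read a single bit; the byte is sign-extended to int, but the mask
    // only selects one of the low eight bits, so the result is unaffected.
    boolean get(int index) {
        return (bits[index >>> 3] & (1 << (index & 7))) != 0;
    }
}
```

Calling set(9) followed by clear(9) on such an object exercises both statements on the same byte, bits[1].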
So, as a first step, I would probably create a JBS issue and send out an RFR on hotspot-dev for this simple enhancement, if there are no objections.
Any opinion is welcome.
Thanks, Bernard