Unsafe.{get,put}-X-Unaligned performance

Peter Levart peter.levart at gmail.com
Thu Mar 12 19:29:19 UTC 2015


On 03/12/2015 07:37 PM, Andrew Haley wrote:
> On 03/12/2015 05:15 PM, Peter Levart wrote:
>> ...or are JIT+CPU smart enough and there would be no difference?
>
> C2 always orders things based on profile counts, so there is no
> difference. Your suggestion would be better for interpreted code
> and I guess C1 also, so I agree it is worthwhile.
>
> Thanks,
> Andrew.

What about the following variant (or similar with ifs in case switch is sub-optimal):

    public final long getLongUnaligned(Object o, long offset) {
        switch ((int) offset & 7) {
            case 1:
            case 5: return
                (toUnsignedLong(getByte(o, offset)) << pickPos(56, 0)) |
                (toUnsignedLong(getShort(o, offset + 1)) << pickPos(48, 8)) |
                (toUnsignedLong(getInt(o, offset + 3)) << pickPos(32, 24)) |
                (toUnsignedLong(getByte(o, offset + 7)) << pickPos(56, 56));
            case 2:
            case 6: return
                (toUnsignedLong(getShort(o, offset)) << pickPos(48, 0)) |
                (toUnsignedLong(getInt(o, offset + 2)) << pickPos(32, 16)) |
                (toUnsignedLong(getShort(o, offset + 6)) << pickPos(48, 48));
            case 3:
            case 7: return
                (toUnsignedLong(getByte(o, offset)) << pickPos(56, 0)) |
                (toUnsignedLong(getInt(o, offset + 1)) << pickPos(32, 8)) |
                (toUnsignedLong(getShort(o, offset + 5)) << pickPos(48, 40)) |
                (toUnsignedLong(getByte(o, offset + 7)) << pickPos(56, 56));
            case 4: return
                (toUnsignedLong(getInt(o, offset)) << pickPos(32, 0)) |
                (toUnsignedLong(getInt(o, offset + 4)) << pickPos(32, 32));
            case 0:
            default:
                return getLong(o, offset);
        }
    }

...it may have more branches, but fewer instructions on average per call.
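For anyone who wants to try the idea outside the JDK, here is a minimal sketch that mirrors the switch above, with a ByteBuffer standing in for Unsafe. It rests on assumptions: the class and method names are hypothetical, and pickPos(top, pos) is assumed to return top - pos on big-endian and pos on little-endian, which is how the shifts above appear to use it. The sketch passes the byte order in explicitly so both cases can be checked on one machine.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: ByteBuffer replaces Unsafe, and every name here is hypothetical.
final class UnalignedSketch {

    // Assumed behaviour of pickPos: shift for a piece whose little-endian bit
    // position is pos, mirrored against top when the byte order is big-endian.
    static int pickPos(boolean bigEndian, int top, int pos) {
        return bigEndian ? top - pos : pos;
    }

    // Assembles a long from the widest pieces that are naturally aligned at the
    // given offset, following the same case split as the variant in the mail.
    static long getLongUnaligned(ByteBuffer b, int offset) {
        boolean be = b.order() == ByteOrder.BIG_ENDIAN;
        switch (offset & 7) {
            case 1:
            case 5:
                return (Byte.toUnsignedLong(b.get(offset)) << pickPos(be, 56, 0))
                     | (Short.toUnsignedLong(b.getShort(offset + 1)) << pickPos(be, 48, 8))
                     | (Integer.toUnsignedLong(b.getInt(offset + 3)) << pickPos(be, 32, 24))
                     | (Byte.toUnsignedLong(b.get(offset + 7)) << pickPos(be, 56, 56));
            case 2:
            case 6:
                return (Short.toUnsignedLong(b.getShort(offset)) << pickPos(be, 48, 0))
                     | (Integer.toUnsignedLong(b.getInt(offset + 2)) << pickPos(be, 32, 16))
                     | (Short.toUnsignedLong(b.getShort(offset + 6)) << pickPos(be, 48, 48));
            case 3:
            case 7:
                return (Byte.toUnsignedLong(b.get(offset)) << pickPos(be, 56, 0))
                     | (Integer.toUnsignedLong(b.getInt(offset + 1)) << pickPos(be, 32, 8))
                     | (Short.toUnsignedLong(b.getShort(offset + 5)) << pickPos(be, 48, 40))
                     | (Byte.toUnsignedLong(b.get(offset + 7)) << pickPos(be, 56, 56));
            case 4:
                return (Integer.toUnsignedLong(b.getInt(offset)) << pickPos(be, 32, 0))
                     | (Integer.toUnsignedLong(b.getInt(offset + 4)) << pickPos(be, 32, 32));
            default: // offset & 7 == 0
                return b.getLong(offset);
        }
    }

    // Compares the piecewise read with ByteBuffer's own getLong at every
    // offset from 0 to 16, so each alignment case is exercised at least twice.
    static boolean matchesReference(ByteOrder order) {
        byte[] data = new byte[24];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (0x80 + i * 7); // distinct bytes, some with the sign bit set
        }
        ByteBuffer b = ByteBuffer.wrap(data).order(order);
        for (int offset = 0; offset <= 16; offset++) {
            if (getLongUnaligned(b, offset) != b.getLong(offset)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("little-endian ok: " + matchesReference(ByteOrder.LITTLE_ENDIAN));
        System.out.println("big-endian ok:    " + matchesReference(ByteOrder.BIG_ENDIAN));
    }
}
```

With this case split a call issues at most four loads: one when the offset is 8-byte aligned, two when it is only 4-byte aligned, three when it is only 2-byte aligned, and four for odd offsets.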

Peter


More information about the hotspot-compiler-dev mailing list