Improving ThreadLocalRandom (and related classes)
Aleksey Shipilev aleksey.shipilev at oracle.com
Wed Jan 9 10:55:09 UTC 2013
On 01/08/2013 08:33 PM, Doug Lea wrote:
> However, the actual ThreadLocalRandom object is padded to avoid memory contention (which wouldn't be necessary or useful if already embedded within Thread).
I'm tempted to disagree. While it is true that most callers access Thread through currentThread(), and most of the Thread state is rarely updated, this can break down catastrophically once we cram in heavily updated fields.
E.g. this is the java.lang.Thread field layout as of 7u12:
$ java -jar java-object-layout.jar java.lang.Thread
Running 64-bit HotSpot VM.
Using compressed references with 3-bit shift.
Objects are 8 bytes aligned.

java.lang.Thread
 offset  size                     type description
      0    12                          (assumed to be the object header + first field alignment)
     12     4                      int Thread.priority
     16     8                     long Thread.eetop
     24     8                     long Thread.stackSize
     32     8                     long Thread.nativeParkEventPointer
     40     8                     long Thread.tid
     48     4                      int Thread.threadStatus
     52     1                  boolean Thread.single_step
     53     1                  boolean Thread.daemon
     54     1                  boolean Thread.stillborn
     55     1                          (alignment/padding gap)
     56     4                   char[] Thread.name
     60     4                   Thread Thread.threadQ
     64     4                 Runnable Thread.target
     68     4              ThreadGroup Thread.group
     72     4              ClassLoader Thread.contextClassLoader
     76     4     AccessControlContext Thread.inheritedAccessControlContext
     80     4           ThreadLocalMap Thread.threadLocals
     84     4           ThreadLocalMap Thread.inheritableThreadLocals
     88     4                   Object Thread.parkBlocker
     92     4            Interruptible Thread.blocker
     96     4                   Object Thread.blockerLock
    100     4 UncaughtExceptionHandler Thread.uncaughtExceptionHandler
    104                                (object boundary, size estimate)
That means a few added primitive fields can easily land on the same cache line as the fields of another Thread, making false sharing a real issue. Padding out the inlined TLR state would save us from this trouble (thankfully, @Contended can do that without magical field arrangements and finger crossing).
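
To make the overlap concrete, here is a rough sketch of the cache-line arithmetic. It is only an illustration under stated assumptions: 64-byte cache lines, a Thread object that happens to start on a cache-line boundary, and another object allocated immediately after it. The class name, the helper methods, and the extra 8-byte field at offset 104 are all hypothetical; only the offsets 12 and 104 come from the layout above.

```java
public class CacheLineSketch {
    // Assumption: 64-byte cache lines (common on x86, not guaranteed elsewhere).
    static final int CACHE_LINE = 64;

    /** Index of the cache line that holds the given absolute address. */
    static long lineOf(long address) {
        return address / CACHE_LINE;
    }

    /**
     * True if the field at base + fieldOffset lies on the same cache line
     * as the first byte of a neighbor object allocated right after this one.
     */
    static boolean sharesLineWithNeighbor(long base, int objectSize, int fieldOffset) {
        return lineOf(base + fieldOffset) == lineOf(base + objectSize);
    }

    public static void main(String[] args) {
        long base = 0; // hypothetical, cache-line-aligned start address

        // Thread.priority at offset 12 in the 104-byte layout shown above.
        System.out.println(sharesLineWithNeighbor(base, 104, 12));   // prints "false"

        // Hypothetical 8-byte field appended at offset 104, growing the object to 112 bytes.
        System.out.println(sharesLineWithNeighbor(base, 112, 104));  // prints "true"
    }
}
```

Under these assumptions the appended field sits on the same line as the start of whatever object follows, so frequent writes to it would contend with accesses to that neighbor. Real placement depends on the allocator and GC, so treat this as arithmetic only, not a measurement.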
We can also mark the whole Thread @Contended, which means pushing Thread to consume 256 bytes instead of the 104+ it takes now. While this seems to be a large increase, it is a global win, since the separately padded TLR state is gone and we are effectively hiding the Thread state in the "padding shadow".
My 2c.
-Aleksey.