RFR (S): CR 8005926: (thread) Merge ThreadLocalRandom state into java.lang.Thread

Aleksey Shipilev aleksey.shipilev at oracle.com
Thu Jan 10 22:31:31 UTC 2013


Hi,

Submitting this on behalf of Doug Lea. The webrev is here: http://cr.openjdk.java.net/~shade/8005926/webrev.00/

Bottom line: merge ThreadLocalRandom state into Thread, to optimize many use cases around j.u.c.* code. Simple performance tests on a 2x2 i5 (Linux x86_64), with 4 threads, 5 forks, 3x3s warmup and 5x3s measurement, show:

JDK8 (baseline):
  TLR.nextInt():           6.4 +- 0.1 ns/op
  TLR.current().nextInt(): 16.1 +- 0.4 ns/op
  TL.get().nextInt():      19.1 +- 0.6 ns/op

JDK8 (patched):
  TLR.nextInt():           6.5 +- 0.2 ns/op
  TLR.current().nextInt(): 6.4 +- 0.1 ns/op
  TL.get().nextInt():      17.2 +- 2.0 ns/op

The TLR.nextInt() figures show the peak performance of the generator itself; anything beyond that is ThreadLocal overhead. One can see that the patched version bypasses the ThreadLocal machinery completely, so the overhead is slim to none.
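For reference, the three measured paths correspond roughly to the calls below (a minimal sketch with hypothetical class and method names; the actual harness, forking and warmup logic are omitted):

import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

// One instance of this state object per benchmark thread (hypothetical names;
// not the actual harness from the webrev).
public class TlrPaths {
    // Case 1, "TLR.nextInt()": a ThreadLocalRandom captured once by the owning
    // thread, so the measurement is the generator itself with no per-call lookup.
    final ThreadLocalRandom tlr = ThreadLocalRandom.current();

    // Case 3, "TL.get().nextInt()": a plain java.util.Random behind an ordinary
    // ThreadLocal, paying the ThreadLocalMap lookup on every call.
    final ThreadLocal<Random> tl = ThreadLocal.withInitial(Random::new);

    int tlrDirect()  { return tlr.nextInt(); }
    // Case 2, "TLR.current().nextInt()": looks the instance up on every call --
    // through the ThreadLocal machinery in the baseline build, straight off the
    // Thread's own fields in the patched build.
    int tlrCurrent() { return ThreadLocalRandom.current().nextInt(); }
    int tlViaGet()   { return tl.get().nextInt(); }
}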

N.B. It gets especially interesting when there are many ThreadLocals registered. Making 1M ThreadLocals and pre-touching them bloats the thread-local maps, and we get:

JDK8 (baseline), contaminators = 1M:
  TLR.nextInt():           6.4 +- 0.1 ns/op
  TLR.current().nextInt(): 21.7 +- 5.3 ns/op
  TL.get().nextInt():      28.7 +- 1.1 ns/op

JDK8 (patched), contaminators = 1M:
  TLR.nextInt():           6.6 +- 0.2 ns/op
  TLR.current().nextInt(): 6.5 +- 0.1 ns/op
  TL.get().nextInt():      29.4 +- 0.5 ns/op

Note that the patched version successfully dodges this pathological case.
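For completeness, a minimal sketch of how such contamination could be reproduced, assuming a hypothetical standalone driver (the actual benchmark setup may differ):

import java.util.concurrent.ThreadLocalRandom;

// Hypothetical reproduction of the "contaminators = 1M" scenario: register and
// pre-touch a million unrelated ThreadLocals so this thread's ThreadLocalMap
// is bloated before the measurement starts.
public class Contaminate {
    public static void main(String[] args) {
        final int CONTAMINATORS = 1_000_000;
        ThreadLocal<?>[] junk = new ThreadLocal<?>[CONTAMINATORS]; // strong refs keep entries live
        for (int i = 0; i < CONTAMINATORS; i++) {
            junk[i] = ThreadLocal.withInitial(Object::new);
            junk[i].get(); // pre-touch: forces an entry into this thread's ThreadLocalMap
        }
        // From here on, every ThreadLocal.get() on this thread probes a much larger
        // hash table, while ThreadLocalRandom.current() in the patched build reads
        // its state straight off the Thread's fields and is unaffected.
        System.out.println(ThreadLocalRandom.current().nextInt());
    }
}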

Testing:

Attribution:

-Aleksey.


