RFR (S): JEP-142: Reduce Cache Contention on Specified Fields
Aleksey Shipilev aleksey.shipilev at oracle.com
Sat Nov 24 00:24:55 PST 2012
On 11/23/2012 01:33 AM, Aleksey Shipilev wrote:
> Hi,
>
> After some internal discussions with Doug Lea, Dave Dice and others,
> I would like to solicit the initial feedback on the implementation of
> JEP-142, aka @Contended [1]:
>   http://openjdk.java.net/jeps/142
>
> The webrev for the initial version is here:
>   http://shipilev.net/pub/jdk/hotspot/contended/webrev-2/
BTW, here's a simple microbenchmark: four threads T1..T4 on my 2x2 i5-2520M, each incrementing its own distinct field i1..i4:
  static class Base {
      @Contended long i1;
      @Contended long i2;
      @Contended long i3;
      @Contended long i4;
  }
Running with proper warmups, 10 measurement iterations for 1 second each, 10 JVM invocations per test, yields:
  -XX:FieldPaddingWidth=0:    7.94 +- 1.10 nsec/op
  -XX:FieldPaddingWidth=8:    4.92 +- 0.53 nsec/op
  -XX:FieldPaddingWidth=16:   4.67 +- 0.54 nsec/op
  -XX:FieldPaddingWidth=32:   4.67 +- 0.31 nsec/op
  -XX:FieldPaddingWidth=64:   3.55 +- 0.03 nsec/op
  -XX:FieldPaddingWidth=128:  3.54 +- 0.03 nsec/op
(The absolute times are heavily inflated by hyperthreading and other infrastructure overheads.)
So that's more than a two-fold difference (7.94 vs. 3.55 nsec/op at width=64) even within a single CPU package, where false-sharing misses are served by the on-chip L3. It will get dramatically worse on multi-CPU hosts. "width=0" corresponds to our current behavior, i.e. dense packing. Note also that the jitter is significantly lower, since we are no longer at the mercy of execution interleavings.
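For readers who want to reproduce the shape of the test, here is a minimal sketch of the four-thread scenario described above. It is not the harness used for the numbers in this message: the class and method names (FalseSharingSketch, Dense, run) are made up for illustration, there is no warmup or multi-JVM averaging, and the @Contended annotation is left out because it is JDK-internal, so the sketch only exercises the densely packed ("width=0") case. The fields are declared volatile purely so the JIT cannot collapse each loop into a single store; that is an assumption of this sketch, not something stated above.

```java
class FalseSharingSketch {
    // Densely packed counters: with the default field layout these four
    // longs are likely (not guaranteed) to share one cache line.
    static class Dense {
        volatile long i1;
        volatile long i2;
        volatile long i3;
        volatile long i4;
    }

    // Runs four threads, each incrementing its own field `iterations` times.
    // Returns {i1, i2, i3, i4, elapsedNanos}.
    static long[] run(int iterations) {
        final Dense d = new Dense();
        Runnable[] tasks = {
            () -> { for (int k = 0; k < iterations; k++) d.i1++; },
            () -> { for (int k = 0; k < iterations; k++) d.i2++; },
            () -> { for (int k = 0; k < iterations; k++) d.i3++; },
            () -> { for (int k = 0; k < iterations; k++) d.i4++; },
        };
        Thread[] threads = new Thread[tasks.length];
        long start = System.nanoTime();
        for (int t = 0; t < tasks.length; t++) {
            threads[t] = new Thread(tasks[t]);
            threads[t].start();
        }
        try {
            // join() also makes each thread's final writes visible here.
            for (Thread th : threads) {
                th.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { d.i1, d.i2, d.i3, d.i4, elapsed };
    }

    public static void main(String[] args) {
        int iterations = 10_000_000;
        long[] r = run(iterations);
        // Rough wall-clock figure only; includes thread startup cost.
        System.out.println("approx nsec/op: " + (double) r[4] / iterations);
    }
}
```

Each field has exactly one writer, so the non-atomic `++` on a volatile is safe here. Any slowdown relative to a padded layout would come from cache-line ping-pong between cores, not from lost updates.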
Thanks,
-Aleksey.