RFR: 8061254: SPECjvm2008-XML performance regressions in 9-b33 (original) (raw)
Vitaly Davidovich vitalyd at gmail.com
Wed May 13 13:11:45 UTC 2015
- Previous message: RFR: 8061254: SPECjvm2008-XML performance regressions in 9-b33
- Next message: RFR: 8061254: SPECjvm2008-XML performance regressions in 9-b33
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Thanks Claes. I think there's some room for improvement in the JIT to piggyback on the user check, but that's a side topic.
Out of curiosity, have you tried moving the slow path (computation of the hash) into an out-of-line method? What's the bytecode size if you do that? This shouldn't matter for C2, but I wonder if it may somehow tickle the JIT/register allocator into better codegen.
On Wed, May 13, 2015 at 9:06 AM, Claes Redestad <claes.redestad at oracle.com> wrote:
I shouldn't have said better. :-)
Running against versions of hashCode using value.length > 0 and h != 0 respectively show results that are within the error range of each other. It's hard to quantify the cost of one extra branch in the slow path, I guess. Checking value.length will only avoid the store for the empty string, but not for other strings with hash==0. For the empty string, we might do one extra compare compare/branch every time we call hashCode, The value.length > 0 check does issue an additional getfield op code for the slow path, avoiding this can be slightly beneficial in some scenarios (even though it shouldn't matter much in the final asm). For reference, this patch (doing if (h != 0) { hash = h; }) nets about a 60x speedup on the micro with -t 32 -p valueString="pollinating sandboxes" on our multi-socket machines over both the 8058643 implementation and the one before it. /Claes
On 2015-05-13 14:53, Vitaly Davidovich wrote: It's interesting that this version performs as good or better than value.length > 0 check. Intuitively, value.length is going to be used anyway (if h == 0) in the loop and so there should be no penalty of loading and branching on it (given the change here has a branch anyway) and compiler can keep that value in a register for the actual loop. Looking at the generated asm for current String.hashCode(), it doesn't look like compiler is actually using the user-written check to skip its own check (see the two tests at the bottom of this snippet): 0x00007f7e312a0d8c: mov %rsi,%rbx 0x00007f7e312a0d8f: mov 0x10(%rsi),%eax ;*getfield hash ; - java.lang.String::hashCode at 1 (line 1454) 0x00007f7e312a0d92: test %eax,%eax 0x00007f7e312a0d94: jne 0x00007f7e312a0e85 ;*ifne ; - java.lang.String::hashCode at 6 (line 1455) 0x00007f7e312a0d9a: mov 0xc(%rsi),%ebp ;*getfield value ; - java.lang.String::hashCode at 10 (line 1455) 0x00007f7e312a0d9d: mov 0xc(%r12,%rbp,8),%r10d ;*arraylength ; - java.lang.String::hashCode at 13 (line 1455) ; implicit exception: dispatches to 0x00007f7e312a0ea9 0x00007f7e312a0da2: test %r10d,%r10d 0x00007f7e312a0da5: jle 0x00007f7e312a0e91 ;*ifle ; - java.lang.String::hashCode at 14 (line 1455) 0x00007f7e312a0dab: test %r10d,%r10d 0x00007f7e312a0dae: jbe 0x00007f7e312a0e95 It does however continue using r10. On Wed, May 13, 2015 at 8:36 AM, Sergey Bylokhov <_ _Sergey.Bylokhov at oracle.com> wrote: Just curious, what is the difference between this fix and an old version of the code:
http://cr.openjdk.java.net/~shade/8058643/webrev.01/src/java.base/share/classes/java/lang/String.java.sdiff.html I meant: "if (h == 0 && value.length > 0) {}" vs "if (h != 0) {hash = h;}"
On 13.05.15 14:51, Claes Redestad wrote: Hi, 9-b33 introduced a sustained regression in SPECjvm2008 xml.transform on a number of our test setups. JDK-8058643 removed the check on value.length > 0, which means repeated calls to "".hashCode() now do a store of the calculated value (0) to the hash field. This has potential to cause excessive cache coherence traffic in xml.transform[1] Extracting the code that showed up in profiles to a micro[2] and running this in multiple threads is up to 100x slower in 9-b33 on a dual-socket machine. Adding back the check that value.length > 0 before entering the calculation loop recuperates the loss in both micro and xml.transform, but we're seeing as good or better number by simply guarding the store: Webrev: http://cr.openjdk.java.net/~redestad/8061254/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8061254 (confidential due to links to internal systems, sorry!) This additionally avoids the field store (and potential for pointless cache traffic) also on non-empty strings whose hash value happen to equals 0. Thanks! /Claes [1] See com.sun.org.apache.xml.internal.dtm.ref.ExpandedNameTable#getExpandedTypeID. [2] http://cr.openjdk.java.net/~redestad/8061254/expandedtypeid-micro.zip
-- Best regards, Sergey.
- Previous message: RFR: 8061254: SPECjvm2008-XML performance regressions in 9-b33
- Next message: RFR: 8061254: SPECjvm2008-XML performance regressions in 9-b33
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]