Optimizing UUID#fromString(String) (original) (raw)

Claes Redestad claes.redestad at oracle.com
Sun Jan 28 23:19:43 UTC 2018


Hi Jon,

On 2018-01-27 17:05, Jon Chambers wrote:

Regardless, I wanted to call this optimization opportunity to your attention, and would be happy to offer a proper patch if this seems like a worthwhile change.

this does looks promising - and at least points out there might be some performance issues lurking here!

Found another compatibility issue, however: The current UUID.fromString allow use of any unicode digits parseable into digits. Say, Arabic digits (U+0660 - U+0669):

jshell> UUID.fromString("\u0669-\u0669-\u0669-\u0669-\u0669"); $2 ==> 00000009-0009-0009-0009-000000000009

That you're getting some improvement from avoiding Character.digit(..) stands out in a way, so I wondered if there's something going on here... and it does seem that this method could be optimized by providing a fast-path in the CharacterDataLatin1 plane...

I experimented with a few solutions, and the best I could find was a 256 element byte[] lookup table for digits in CharacterDataLatin1 (spanning the entire 8-bit space of CharacterDataLatin1 allows us to avoid a branch):

http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/

This speeds up your micro testing java.util.UUID (UUIDBenchmark.benchmarkUUIDFromString) by ~1.6x. The improvement is there on all counters when looking at -prof perfnorm (less loads, stores, branches, instructions, better CPI etc)[1], so seems like a pretty straightforward thing to consider independently.

Your FastUUID implementation is still quite a bit faster, though, but it'd be good to dig around for a bit to see if there are other more general improvements like this to be had..

Thanks!

/Claes

[1] Baseline: Benchmark                                                    Mode Cnt     Score   Error  Units UUIDBenchmark.benchmarkUUIDFromString                        avgt 4     0.505 ± 0.066  us/op UUIDBenchmark.benchmarkUUIDFromString:CPI avgt          1.024           #/op UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses avgt          3.000           #/op UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt        211.128           #/op UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt         34.894           #/op UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses avgt          0.067           #/op UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt          0.103           #/op UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt          0.812           #/op UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt          0.004           #/op UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt          0.054           #/op UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt         13.269           #/op UUIDBenchmark.benchmarkUUIDFromString:branches avgt        301.931           #/op UUIDBenchmark.benchmarkUUIDFromString:cycles avgt       1585.057           #/op UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt          0.005           #/op UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt        208.788           #/op UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt         ≈ 10⁻⁴           #/op UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt         30.540           #/op UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt          0.002           #/op UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt          0.006           #/op UUIDBenchmark.benchmarkUUIDFromString:instructions avgt       1547.568           #/op

Results with http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/ -- : Benchmark                                                    Mode Cnt     Score   Error  Units UUIDBenchmark.benchmarkUUIDFromString                        avgt 4     0.310 ± 0.022  us/op UUIDBenchmark.benchmarkUUIDFromString:CPI avgt          0.832           #/op UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses avgt          2.026           #/op UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt        208.227           #/op UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt         30.673           #/op UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses avgt          0.052           #/op UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt          0.042           #/op UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt          1.027           #/op UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt          0.007           #/op UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt          0.044           #/op UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt          0.040           #/op UUIDBenchmark.benchmarkUUIDFromString:branches avgt        255.547           #/op UUIDBenchmark.benchmarkUUIDFromString:cycles avgt        981.466           #/op UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt          0.003           #/op UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt        213.797           #/op UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt         ≈ 10⁻⁴           #/op UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt         30.888           #/op UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt          0.002           #/op UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt          0.002           #/op UUIDBenchmark.benchmarkUUIDFromString:instructions avgt       1180.129           #/op



More information about the core-libs-dev mailing list