Improving performance and reducing object allocations of java.util.UUID to/from string

Steven Schlansker stevenschlansker at gmail.com
Sun Jan 20 00:54:03 UTC 2013


Thank you for the feedback!

On Jan 10, 2013, at 4:50 AM, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:

On 01/09/2013 09:51 PM, Steven Schlansker wrote:

Hello again,

I sent this email a week ago and have received no replies. Is there any step I have missed necessary to contribute to the JDK libraries?

I think the crucial part is OCA, as per: http://openjdk.java.net/contribute/

I have now taken care of this :-) The OCA is under the company name "Ness Computing" and was emailed yesterday.

I am very interested in making your lives easier, so please let me know if I am in the wrong place or am otherwise misguided.

You are at the correct place. At first glance, the change looks good for a start. A few comments, though:

a) Do you need the masks before or-ing with most/leastSigBits?

So, they seem to exist to satisfy a few Apache Harmony test cases: http://svn.apache.org/viewvc/harmony/enhanced/java/branches/java6/classlib/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/util/UUIDTest.java?revision=929252

For example:

    uuid = UUID.fromString("123456789-0-0-0-0");
    Assert.assertEquals(0x2345678900000000L, uuid.getMostSignificantBits());
    Assert.assertEquals(0x0L, uuid.getLeastSignificantBits());

Without the masking, the values can smear together. It would be possible to avoid converting extra characters, but I believe that the mask is probably cheaper than an if to check for the presence of extra characters.
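To illustrate the smearing, here is a minimal sketch. It is not the code from the patch: the class name MaskDemo and the method mostSigBits are hypothetical, and the fields are parsed with Long.parseLong purely for brevity. It combines the first three hex fields with the shift-and-or pattern, with and without masking each field to its nominal width (32, 16 and 16 bits).

```java
class MaskDemo {
    // Hypothetical helper: builds the most significant 64 bits from the
    // first three hex fields of a UUID string, optionally masking each
    // field to its nominal width before combining.
    static long mostSigBits(String[] fields, boolean mask) {
        long a = Long.parseLong(fields[0], 16);
        long b = Long.parseLong(fields[1], 16);
        long c = Long.parseLong(fields[2], 16);
        if (mask) {
            a &= 0xFFFFFFFFL; // first field: 32 bits
            b &= 0xFFFFL;     // second field: 16 bits
            c &= 0xFFFFL;     // third field: 16 bits
        }
        long bits = a;
        bits <<= 16;
        bits |= b;
        bits <<= 16;
        bits |= c;
        return bits;
    }

    public static void main(String[] args) {
        String[] oversized = {"0", "123456", "0"};
        // Without the mask, the extra digits of the second field spill
        // into the bit positions that belong to the first field.
        System.out.println(Long.toHexString(mostSigBits(oversized, false)));
        // With the mask, only the low 16 bits of the second field survive.
        System.out.println(Long.toHexString(mostSigBits(oversized, true)));
    }
}
```

With the oversized second field "123456", the unmasked result is 0x1234560000, while the masked result is 0x34560000. For the fields of "123456789-0-0-0-0", the masked result is 0x2345678900000000, which matches the Harmony assertion quoted above.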

Were I writing the code from scratch, I would throw an IllegalArgumentException on such a UUID, but I assume that is not acceptable for compatibility reasons.

b) Is there a more standard (and still performant) way to do the hex conversion? Look around the JDK source; I think there should be something else needing the same kind of conversion.

I didn't see any candidates. I thought of at least the following places to look: Long, String, Formatter, BitSet, and did some general browsing around java.lang and java.util with no luck. The Long methods are in java.lang and so could not be used without making them public. I assume that having private helper methods for my class is preferable to introducing new public methods. It's entirely possible I missed something, but I'm not seeing it.

c) I'd go for making utility methods a bit more generic. For one, I would rather make decodeHex(String str, int start, int end), and encodeHex(char[] dest, int offset, int value).

I can do this if you feel strongly, but I believe that this compromises the readability of the code. For example, right now you can see:

    long mostSigBits = decode(str, dashPos, 0);
    mostSigBits <<= 16;
    mostSigBits |= decode(str, dashPos, 1);
    …

whereas to make it use a "more general" helper it would have to look like this:

    long mostSigBits = decode(str, dashPos[0]+1, dashPos[1]);
    mostSigBits <<= 16;
    mostSigBits |= decode(str, dashPos[1], dashPos[2]+1);

This is an increase from one "magic value" per invocation to three.

It's not a huge difference either way, but I'm not sure that your suggestion is better? I suppose I could make it a non-private helper method available to others, although I am not sure if this is considered good practice in the JDK code. Let me know if you think I should do this.
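For reference, the more generic helpers suggested in c) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from the patch: the class name HexHelpers is hypothetical, the end index is treated as exclusive, and encodeHex takes an extra digits parameter (not in the suggested signature) so that the caller decides how many characters to write.

```java
class HexHelpers {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Decodes the hex digits in str[start, end) into a long.
    // Assumption: end is exclusive; digits beyond 16 overflow silently.
    static long decodeHex(String str, int start, int end) {
        long result = 0;
        for (int i = start; i < end; i++) {
            int digit = Character.digit(str.charAt(i), 16);
            if (digit < 0) {
                throw new NumberFormatException("Not a hex digit at index " + i);
            }
            result = (result << 4) | digit;
        }
        return result;
    }

    // Writes the low `digits` hex digits of value into dest, starting at
    // offset, most significant digit first. No allocation takes place.
    static void encodeHex(char[] dest, int offset, int digits, long value) {
        for (int i = digits - 1; i >= 0; i--) {
            dest[offset + i] = HEX[(int) (value & 0xF)];
            value >>>= 4;
        }
    }
}
```

A caller would then compute the start and end positions of each field itself, which is the source of the extra "magic values" per invocation mentioned above.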

Microbenchmark glitches:

a) % is an integer division, and at the scale of the operations you are measuring, it could incur significant costs; the usual practice is having a power-of-2 size, and then (i % size) -> (i & (size - 1)).

I fixed this, but it made no meaningful change to the numbers.

b) Not sure if you want to stick with random UUIDs for comparisons. While the law of large numbers is on your side, 1000 random UUIDs might not be random enough.

I don't believe that this has any bearing on the result. I've run the trials many, many times and the number rarely changes by even 1% from the values that I posted beforehand. If you think this is an actual problem, I can come up with an alternate methodology. I don't think it's a problem.
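As a small illustration of point a), the sketch below shows the index wraparound both ways. The class and method names are hypothetical and this is not the benchmark code itself. The two forms agree only when size is a power of two and i is non-negative.

```java
class IndexWrap {
    // Wraparound using the remainder operator (an integer division).
    static int wrapMod(int i, int size) {
        return i % size;
    }

    // Wraparound using a bit mask. Assumption: size is a power of two.
    static int wrapMask(int i, int size) {
        return i & (size - 1);
    }

    public static void main(String[] args) {
        int size = 1024; // power of two, so the mask is 0x3FF
        for (int i = 0; i < 5000; i++) {
            if (wrapMod(i, size) != wrapMask(i, size)) {
                throw new AssertionError("mismatch at " + i);
            }
        }
        // For negative i the two differ: % keeps the sign of the dividend,
        // while the mask always yields a value in [0, size).
        System.out.println(wrapMod(-1, size) + " vs " + wrapMask(-1, size));
    }
}
```

Note that a pool of 1000 entries would have to be resized to 1024 (or another power of two) for the masked form to be valid.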

Thanks again for the review. Please let me know if you agree or believe I am mistaken in my analysis above.


