RFR: JDK-8184947:,ZipCoder performance improvements (original) (raw)

Xueming Shen xueming.shen at oracle.com
Mon Dec 11 04:08:54 UTC 2017


On 12/10/17, 3:12 PM, Claes Redestad wrote:

Hi Sherman,

On 2017-12-09 00:09, Xueming Shen wrote: Hi,

Please help review the changes for j.u.z.ZipCoder/JDK-8184947 (which also includes cleanup/improvement work in java.lang.StringCoding.java to speed up general String coding performance, especially for UTF8). issue: https://bugs.openjdk.java.net/browse/JDK-8184947 webrev: http://cr.openjdk.java.net/~sherman/8184947/webrev I've not fully reviewed this yet, but something struck me halfway through: As the ASCII fast-path is what's really important here, we could write that part without ever having to go via a StringCoding.Result. On four of your ZipCodingBM micros this improves performance a bit further (~10%): diff -r 848591d85052 src/java.base/share/classes/java/lang/StringCoding.java --- a/src/java.base/share/classes/java/lang/StringCoding.java Sun Dec 10 18:48:21 2017 +0100 +++ b/src/java.base/share/classes/java/lang/StringCoding.java Sun Dec 10 18:55:38 2017 +0100 @@ -937,7 +937,13 @@ * Throws iae, instead of replacing, if malformed or unmmappble. */ static String newStringUTF8NoRepl(byte[] bytes, int off, int len) { - Result ret = decodeUTF8(bytes, off, len, false); + if (COMPACTSTRINGS && !hasNegatives(bytes, off, len)) { + return new String(Arrays.copyOfRange(bytes, off, off + len), LATIN1); + } + Result ret = decodeUTF80(bytes, off, len, false); return new String(ret.value, ret.coder); }

Hi Claes,

You're definite right on this one. We dont need the Result for the ASCII case. webrev has been updated accordingly.

http://cr.openjdk.java.net/~sherman/8184947/webrev

Yes, understood the threadlocal might not be the best choice here. But just feel something need to be done for the temporary Result object after observed its usage with the jfr in #8184947. It is taking as many spaces as the overall String objects do. Sure, it's in young-gen, should be wiped with quickly. But my take is it might be worth the tradeoff of having each/every new String/cs) get a little slower instead of having the "global" vm has to do some extra clean up for this extra Result object, for now.

thanks, sherman



More information about the core-libs-dev mailing list