hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

Peter Levart peter.levart at gmail.com
Thu Nov 15 22:49:34 UTC 2012


Hi,

This change is 6 months old now. I wonder if Oracle received any complaints from the users since then. I mean complaints that are based on real observations of performance degradation in real code - not only speculation.

Regards, Peter

2012/11/15 Zhong Yu <zhong.j.yu at gmail.com>

Since this change is meant to achieve a minor performance boost, it's not fair to defend it by saying that it only incurs minor performance penalties.

Java programs are infested with strings, most of which could have used a more appropriate type, but it is the insane reality. Any change to the behavior of strings should have been backed up by a much more thorough analysis. Every usage of substring() was (hopefully) the result of some conscious reasoning about space-time. Even if this change does not significantly alter an application's performance, it invalidates all the reasoning; that's the worst blow in my book. There would be no problem if substring() had done copying from day one, but 17 years have passed.

Zhong Yu

On Wed, Nov 14, 2012 at 6:58 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> Personally, I feel like the concern is a bit overstated:
>
> 1) the n in O(n) is likely actually fairly small in practice (at least in
> what I'd consider sane code)
> 2) I think a lot of people that worry about perf probably aren't using
> substring() anyway
> 3) copying char[] is optimized by jit - this is basically a memcpy()-like
> call, which modern machines handle well
> 4) the upside is strings are 8 bytes smaller
> 5) .NET substring() has always allocated new storage (via an optimized
> internal VM call) and never shared the char[] and I haven't come across any
> complaints or seen serious perf problems myself (granted I seldom use
> substring)
>
> So I don't know if this is anything to worry about in practice.
>
> Sent from my phone
>
> On Nov 14, 2012 5:26 PM, "Zhong Yu" <zhong.j.yu at gmail.com> wrote:
>>
>> On 06/03/2012 11:35 PM, Mike Duigou wrote:
>> > [I trimmed the distribution list]
>> >
>> > On Jun 3 2012, at 13:44 , Peter Levart wrote:
>> >
>> >> On Thursday, May 31, 2012 03:22:35 AM mike.duigou at oracle.com wrote:
>> >>> Changeset: 2c773daa825d
>> >>> Author: mduigou
>> >>> Date: 2012-05-17 10:06 -0700
>> >>> URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
>> >>>
>> >>> 6924259: Remove offset and count fields from java.lang.String
>> >>> Summary: Removes the use of shared character array buffers by String along
>> >>> with the two fields needed to support the use of shared buffers.
>> >> Wow, that's quite a change.
>> > Indeed. It was a long time in development. It is a change which is
>> > expected to be overall beneficial though and in the general case a positive
>> > win.
>>
>> Wow!
>>
>> If the previous behavior of substring() was once a bug, by now it has
>> become a well known feature. People know about it, and people depend
>> on it.
>>
>> This change is a big surprise. Changing O(1) to O(n) is a breach of
>> contract. It'll break lots of old code; and meanwhile lots of new code
>> is still being written based on the old assumption. After people
>> learn about the new behavior, they will need to comb through and rewrite
>> their code.
>>
>> The worst part is the same code performs very differently on different
>> versions of JDK. What's a programmer supposed to do if his code
>> targets JDK6 and above? If the cost of strings is no longer certain,
>> what else can we believe in?
>>
>> Is there any chance in hell to roll it back? Maybe add a new method
>> for the new behavior?
>>
>> Zhong Yu
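
For readers following the O(1) versus O(n) argument above, here is a minimal sketch of the two layouts being debated. It is not the JDK source: SharedString and CopyingString are hypothetical stand-ins invented for illustration, assuming only what the changeset summary states (a shared character array plus offset and count fields before the change, a private array with no such fields after it). Bounds checks are omitted to keep it short.

```java
import java.util.Arrays;

public class SubstringSketch {

    /** Hypothetical model of the old layout: shared buffer plus offset and count. */
    static final class SharedString {
        final char[] value;   // backing array, possibly shared with other strings
        final int offset;     // index of the first character in value
        final int count;      // number of characters

        SharedString(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        /** O(1): only a new header is created; the backing array is reused. */
        SharedString substring(int begin, int end) {
            return new SharedString(value, offset + begin, end - begin);
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    /** Hypothetical model of the new layout: each string owns exactly its characters. */
    static final class CopyingString {
        final char[] value;   // private array, no offset or count fields needed

        CopyingString(char[] value) {
            this.value = value;
        }

        /** O(n) in the substring length: the requested range is copied. */
        CopyingString substring(int begin, int end) {
            return new CopyingString(Arrays.copyOfRange(value, begin, end));
        }

        @Override
        public String toString() {
            return new String(value);
        }
    }

    public static void main(String[] args) {
        char[] text = "hello, world".toCharArray();

        SharedString shared = new SharedString(text, 0, text.length).substring(7, 12);
        CopyingString copied = new CopyingString(text).substring(7, 12);

        // Same characters either way; only the sharing of the buffer differs.
        System.out.println(shared + " shares buffer: " + (shared.value == text));
        System.out.println(copied + " shares buffer: " + (copied.value == text));
    }
}
```

In this sketch both variants yield the same characters. The difference is that the shared variant keeps a reference to the whole original array and pays for two extra int fields per string, while the copying variant pays for an array copy on each substring() call.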



More information about the core-libs-dev mailing list