Dismal performance of String.intern()

Steven Schlansker stevenschlansker at gmail.com
Mon Jun 10 18:06:14 UTC 2013


Hi core-libs-dev,

While profiling my application, I discovered that nearly 50% of the time spent deserializing JSON was inside String.intern(). I understand that interning Strings is generally not the best approach, but I think I have a decent use case -- the value of a certain field is one of a very limited number of valid values (which are not known at compile time, so I cannot use an Enum), and it is repeated many millions of times in the JSON stream.

I discovered that replacing String.intern() with a ConcurrentHashMap improved performance by almost an order of magnitude.
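
For illustration, here is a minimal sketch of the kind of replacement described above. It assumes a single static pool is acceptable; the class name MapInterner and its intern method are hypothetical and are not the code from my application. Note that, unlike String.intern(), this pool holds strong references and never releases entries, so it only makes sense when the set of distinct values is small.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: canonicalizes equal strings to one shared instance
// using a ConcurrentHashMap instead of the JVM's string table.
final class MapInterner {
    // Strongly referenced pool; entries are never evicted (assumption:
    // the number of distinct values is small and bounded).
    private static final ConcurrentMap<String, String> POOL =
            new ConcurrentHashMap<String, String>();

    private MapInterner() {}

    // Returns the canonical instance for s, registering s if it is new.
    static String intern(String s) {
        String existing = POOL.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        // Two distinct String objects with equal contents.
        String a = new String("ACTIVE");
        String b = new String("ACTIVE");
        // After interning, both resolve to the same instance.
        System.out.println(MapInterner.intern(a) == MapInterner.intern(b));
    }
}
```

The deserializer would then call MapInterner.intern(value) at the point where it previously called value.intern().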

I'm not the only person that discovered this and was surprised: http://stackoverflow.com/questions/10624232/performance-penalty-of-string-intern

I've been excited about starting to contribute to OpenJDK, so I am thinking that this might be a fun project for me to take on and then contribute back. But I figured I should check in on the list before spending a lot of time tracking this down. I have a couple of preparatory questions:

I'm sure that if I get anywhere with this I will have more questions, but this should get me started. Thank you for any advice / insight you may be able to provide!

Steven


