RFR: String Density/Compact String JEP 254 (update)

Xueming Shen xueming.shen at oracle.com
Fri Oct 30 21:30:23 UTC 2015


Hi,

Thanks for the comments/suggestions. Here are the updated webrevs with minor changes here and there based on the feedback.

http://cr.openjdk.java.net/~sherman/8054307/jdk/
http://cr.openjdk.java.net/~thartmann/compact_strings/webrev/hotspot/

[closed, Oracle internal only]
http://javaweb.us.oracle.com/~tohartma/compact_strings/hotspot/
http://javaweb.us.oracle.com/~tohartma/compact_strings/hotspot_test_closed/

The code is ready for integration. The current plan is to integrate via the hotspot repo in the coming week if it passes the PIT.

Thanks
-Sherman

On 10/5/15 8:30 AM, Xueming Shen wrote:

(resent to hotspot-dev at openjdk.java.net)

Hi,

Please review the change for JEP 254/Compact String project.

JEP 254: http://openjdk.java.net/jeps/254
Issue:   https://bugs.openjdk.java.net/browse/JDK-8054307
Webrevs: http://cr.openjdk.java.net/~sherman/8054307/jdk/
         http://cr.openjdk.java.net/~thartmann/compactstrings/webrev/hotspot

Description:

The String Density project changes the internal representation of the String class from a UTF-16 char array to a byte array plus an encoding flag field. The new String class stores characters encoded either as ISO-8859-1/Latin-1 (one byte per character) or as UTF-16 (two bytes per character), based upon the contents of the string. The encoding flag indicates which encoding is used. It offers a reduced memory footprint while maintaining throughput performance. See JEP 254 for additional information.

Implementation repo/try out:

http://hg.openjdk.java.net/jdk9/sandbox/
branch: JDK-8054307-branch

    $ hg clone http://hg.openjdk.java.net/jdk9/sandbox/
    $ cd sandbox
    $ sh ./getsource.sh
    $ sh ./common/bin/hgforest.sh up -r JDK-8054307-branch
    $ make configure
    $ make images

Implementation Notes:

- To change the internal representation of the String and the String builder classes (AbstractStringBuilder, StringBuilder and StringBuffer) from a UTF-16 char array to a byte array plus an encoding flag field. The new representation stores the String characters in a single-byte format, using the lower 8 bits of each character's 16-bit UTF16 value, and sets the encoding flag to LATIN1 if all characters of the String object are Unicode Latin1 characters (with UTF16 value < \u0100). It stores the String characters in a 2-byte format with their UTF-16 value and sets the flag to UTF16 if any character in the String object is NOT a Unicode Latin1 character.

- To change the method implementations of the String class and its builders to operate on the new internal character storage, mainly by delegating to two implementation classes, StringUTF16 and StringLatin1.

- To update the StringCoding class to decode/encode the String between String.byte[]/coder(LATIN1/UTF16) <-> byte[] (native encoding) instead of the original String.char[] <-> byte[] (native encoding).

- To update the HotSpot compiler (new and updated intrinsics), GC (String Deduplication mods) and Runtime to work with the new internal "byte[] + coder flag" representation. See Tobias's note for details of the hotspot changes:
  http://cr.openjdk.java.net/~thartmann/compactstrings/hotspot-impl-note

- To add a vm option "CompactStrings" (default is true) to provide a switch-off mechanism to always store the String characters in UTF16 encoding (always 2 bytes, but still in a byte[], instead of the original char[]).
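For readers who want a concrete picture of the "byte[] + coder flag" idea in the notes above, here is a minimal sketch. It is not the code in the webrev: the class CompactStringSketch, the holder Compact, and the helpers encode and charAt are hypothetical names made up for illustration, and the big-endian byte order used for the 2-byte case is an assumption chosen for simplicity.

```java
public class CompactStringSketch {
    // Coder flag values; the names follow the notes, the numeric values are assumed.
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /** Hypothetical holder for the "byte[] + coder flag" layout. */
    static final class Compact {
        final byte[] value;
        final byte coder;

        Compact(byte[] value, byte coder) {
            this.value = value;
            this.coder = coder;
        }
    }

    /** Picks the one-byte form only if every char is below \u0100. */
    static Compact encode(char[] chars) {
        boolean allLatin1 = true;
        for (char c : chars) {
            if (c >= 0x100) {
                allLatin1 = false;
                break;
            }
        }
        if (allLatin1) {
            byte[] value = new byte[chars.length];
            for (int i = 0; i < chars.length; i++) {
                value[i] = (byte) chars[i]; // keep the lower 8 bits
            }
            return new Compact(value, LATIN1);
        }
        byte[] value = new byte[chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            value[2 * i] = (byte) (chars[i] >> 8); // high byte first (assumed order)
            value[2 * i + 1] = (byte) chars[i];    // low byte
        }
        return new Compact(value, UTF16);
    }

    /** Reads one char back, dispatching on the coder flag. */
    static char charAt(Compact s, int index) {
        if (s.coder == LATIN1) {
            return (char) (s.value[index] & 0xff);
        }
        int hi = s.value[2 * index] & 0xff;
        int lo = s.value[2 * index + 1] & 0xff;
        return (char) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        Compact a = encode("hello".toCharArray());
        Compact b = encode("h\u00e9llo \u4e16".toCharArray());
        System.out.println(a.coder + " " + a.value.length); // prints "0 5"
        System.out.println(b.coder + " " + b.value.length); // prints "1 14"
    }
}
```

In terms of this sketch, switching "CompactStrings" off would correspond to always taking the 2-byte branch in encode, so the storage is still a byte[] but never shrinks to one byte per character.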

Supporting performance artifacts:

- Report(s) on memory footprint impact

  http://cr.openjdk.java.net/~shade/density/string-density-report.pdf

  Latest SPECjbb2005 footprint reduction and throughput numbers for both Intel (Linux) and SPARC, which show that the Compact String binaries use less memory and have higher throughput.

  latest: http://cr.openjdk.java.net/~sherman/8054307/specjbb2005
  old:    http://cr.openjdk.java.net/~huntch/string-density/reports/String-Density-SPARC-jbb2005-Report.pdf

- Throughput performance impact via String API micro-benchmarks

  http://cr.openjdk.java.net/~thartmann/compactstrings/microbenchmarks/Haswell090915.pdf
  http://cr.openjdk.java.net/~thartmann/compactstrings/microbenchmarks/IvyBridge090915.pdf
  http://cr.openjdk.java.net/~thartmann/compactstrings/microbenchmarks/Sparc090915.pdf
  http://cr.openjdk.java.net/~sherman/8054307/string-coding.txt

Thanks,
Sherman



More information about the core-libs-dev mailing list