Enhanced String literal for Java: version 1.3 with implementation.

rssh at gradsoft.com.ua
Wed Mar 18 04:14:59 PDT 2009


Good day. Here is the next version of the enhanced string literals proposal for Java.

Changes are:

Here is the next version: 1.3

AUTHOR(s): Ruslan Shevchenko, Reinier Zwitserloot (if he agrees)

OVERVIEW:

FEATURE SUMMARY: new string literals in the Java language:

MAJOR ADVANTAGE: Strings from other languages can be coded more elegantly, such as SQL constructions or inline XML (with multiline strings) or regular expressions (with string literals without escape processing).

MAJOR DISADVANTAGE: A small increase in language complexity.

ALTERNATIVES:

For multiline strings, use concatenation operators and methods, such as:

  "Multiline \n".concat("string ");

or

  String bigString="First line\n"+ "second line";

For unescaped ('raw') strings, use escaping in an ordinary Java string.

EXAMPLES

SIMPLE EXAMPLE:

Multiline string:

  StringBuilder sb = new StringBuilder();
  sb.append("""select a from Area a, CountryCodes cc
               where cc.isoCode='UA'
               and a.owner = cc.country
            """);
  if (question.getAreaName()!=null) {
    sb.append("""and a.name like ?
              """);
    sqlParams.setString(++i,question.getAreaName());
  }

instead of:

  StringBuilder sb = new StringBuilder();
  sb.append("select a from Area a, CountryCodes cc\n");
  sb.append("where cc.isoCode='UA'\n");
  sb.append("and a.owner=cc.country\n");
  if (question.getAreaName()!=null) {
    sb.append("and a.name like ?");
    sqlParams.setString(++i,question.getAreaName());
  }

  String platformDepended="""q
                          """.platformLf();

is 'q\n' if run on Unix and 'q\r\n' if run on Windows.

  String platformIndepended="""q
                            """;

is always "q\n".

  String platformIndepended="""q
                            """.unixLf();

is the same.

  String platformIndepended="""
                            """.windowsLf();

is always '\r\n'.

Unescaped string:

  String myPattern=''..*\.*'';

instead of

  String myPattern="..*\\.*";

  String fname=''C:\Program Files\My Program\Configuration'';

instead of

  String fname="C:\\Program Files\\My Program\\Configuration";

ADVANCED EXAMPLE:

  String empty="""
  """;

is empty.

  String foo = """
      bar
       baz
       bla
      qux""";

is equal to:

  String foo = "bar\n baz\n bla\nqux";

and the following:

  String foo = """
      foo
    bar""";

is compiled to "foo\nbar" with a compilation warning.

String manyQuotes=""""""""""""; is """""

  String s = """I'm long string in groovy style wi\
th \\ at end of line""";

is:

  I'm long string in groovy style with \ at end of line

  String s = ''I'm long string in groovy style wi\
th \\ at end of line'';

is:

  I'm long string in groovy style wi\
th \\ at end of line

DETAILS:

Multiline strings are parts of the program text that begin and end with three double quotes.

I.e. the grammar in 3.10.5 of the JLS can be extended as:

MultilineStringLiteral:
        """ MultilineStringCharacters/opt """

MultilineStringCharacters:
        MultilineStringCharacter
        MultilineStringCharacters  (MultilineStringCharacter but not ")
        (MultilineStringCharacters but not "") "

MultilineStringCharacter:
        InputCharacter but not \
        EscapeSequence
        LineTermination
        EolEscapeSequence

EolEscapeSequence: \LineTermination.

Unescaped strings are parts of the program text that begin and end with two single quotes.

 RawStringLiteral:
                   '' RawInputCharacters/opt ''

 RawInputCharacters:
                      ' (InputCharacter but not ')
                     |
                      (InputCharacter but not ') '
                     |
                      LineTermination

Methods for replacing the line termination sequences in a string with the native format of the host platform, and with the well-known Unix/Windows formats, must be added to the standard library.

COMPILATION:

Handling of multiline strings:

The text within the """ brackets is processed in the following way:

  1. It is split into a sequence of lines at the line termination symbols.
  2. Escape sequences in each line are processed exactly as in ordinary Java strings.
  3. A \LineTermination sequence at the end of a line is erased, and that line is concatenated with the next line into one.
  4. Leading whitespace is eliminated in the following way:
  5. The lines remaining after the leading whitespace sequence is erased are concatenated, with LF (i.e. '\n') between each two neighbouring lines, regardless of the host system.
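
The splitting, escape processing, line joining and LF concatenation described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the compiler change itself: the class and method names (MultilineLiteralSketch, valueOf, unescape, endsWithLineEscape) are hypothetical, only the simple single-character escapes are handled, and leading-whitespace elimination is left out because its rules are not spelled out in the list above.

```java
import java.util.ArrayList;
import java.util.List;

final class MultilineLiteralSketch {

    /** Computes a string value from the text found between the triple quotes. */
    static String valueOf(String body) {
        // Split on any line terminator; the -1 limit keeps trailing empty lines.
        String[] lines = body.split("\r\n|\r|\n", -1);

        List<String> logical = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            boolean hasNext = i < lines.length - 1;
            if (hasNext && endsWithLineEscape(line)) {
                // Backslash before the line terminator: erase it and join with the next line.
                current.append(unescape(line.substring(0, line.length() - 1)));
            } else {
                current.append(unescape(line));
                logical.add(current.toString());
                current.setLength(0);
            }
        }
        // Leading-whitespace elimination is deliberately NOT implemented here.
        // Lines are always joined with LF, regardless of the host system.
        return String.join("\n", logical);
    }

    /** True if the line ends with an odd number of backslashes, i.e. an unpaired one. */
    static boolean endsWithLineEscape(String line) {
        int count = 0;
        for (int i = line.length() - 1; i >= 0 && line.charAt(i) == '\\'; i--) {
            count++;
        }
        return count % 2 == 1;
    }

    /** Handles a subset of Java escapes (assumption: no octal escapes). */
    static String unescape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (++i == s.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            switch (s.charAt(i)) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case '\\': out.append('\\'); break;
                case '"': out.append('"'); break;
                case '\'': out.append('\''); break;
                default:
                    throw new IllegalArgumentException("unsupported escape: \\" + s.charAt(i));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Body text "wi\<LF>th \\ at end of line", written as an ordinary Java literal.
        System.out.println(valueOf("wi\\\nth \\\\ at end of line"));
        // prints: with \ at end of line
    }
}
```

Counting the trailing backslashes keeps an escaped backslash at the end of a line from being mistaken for a line continuation.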

Handling of raw strings:

The text within the '' brackets is processed in the following way:

  1. It is split into a sequence of lines at the line termination symbols.
  2. The lines are concatenated, with '\n' between each two neighbouring lines.

No escape processing and no leading whitespace elimination are performed to obtain the resulting string value.

The new string literals are created and used in .class files exactly as ordinary strings.

TESTING: add the new string literals to test cases for all combinations of finished and unfinished escape sequences and quotes.

LIBRARY SUPPORT: It would be good to add the following methods to String:

  s.platformLf() - returns a string in which all line-termination sequences in s are replaced by the value of the system property 'line.separator'
  s.unixLf() - returns a string in which all line-termination sequences in s are replaced by '\n'
  s.windowsLf() - returns a string in which all line-termination sequences in s are replaced by '\r\n'
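
As a rough illustration of what these methods could do, here is a minimal sketch written as static helpers, since methods cannot be added to String from user code. The class name LineEndings and the helper withSeparator are hypothetical. It assumes that a "line-termination sequence" means CR LF, a lone CR, or a lone LF.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LineEndings {

    // CR LF is listed first so that it is replaced as one unit, not as two terminators.
    private static final Pattern LINE_TERMINATOR = Pattern.compile("\r\n|\r|\n");

    /** Replaces every line terminator in s by the given separator. */
    static String withSeparator(String s, String separator) {
        return LINE_TERMINATOR.matcher(s).replaceAll(Matcher.quoteReplacement(separator));
    }

    static String unixLf(String s) {
        return withSeparator(s, "\n");
    }

    static String windowsLf(String s) {
        return withSeparator(s, "\r\n");
    }

    static String platformLf(String s) {
        // Same system property that the proposal names.
        return withSeparator(s, System.getProperty("line.separator"));
    }

    public static void main(String[] args) {
        String mixed = "a\r\nb\rc\nd";
        System.out.println(unixLf(mixed).equals("a\nb\nc\nd"));          // prints: true
        System.out.println(windowsLf(mixed).equals("a\r\nb\r\nc\r\nd")); // prints: true
    }
}
```

With these helpers, windowsLf applied to a string that already uses '\r\n' returns it unchanged, so the conversions can be applied repeatedly without doubling terminators.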

REFLECTIVE APIS: None

OTHER CHANGES: None

MIGRATION: None

COMPATIBILITY: None

REFERENCES

http://bugs.sun.com/view_bug.do?bug_id=4165111
http://bugs.sun.com/view_bug.do?bug_id=4472509
http://docs.google.com/View?docid=d36kv8n_32g9zj7pdd - by Jacek Furmankiewicz
http://blog.efftinge.de/2008/10/multi-line-string-literals-in-java.html - library implementation by Sven Efftinge
http://www.jroller.com/scolebourne/entry/java_7_multi_line_string - proposal by Stephen Colebourne
http://mail.openjdk.java.net/pipermail/coin-dev/2009-March/000331.html - alternative joke proposal by Felix Frost

IMPLEMENTATION URL:

Compiler changes with a set of jtreg tests are available from the Mercurial repository at http://datacenter.gradsoft.ua/mercurial/cgi/hgwebdir.cgi/jdk7/tl/langtools/


