Digit Separators
ISO/IEC JTC1 SC22 WG21 N3661 - 2013-04-19
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
Problem
Solution
Proposal
2.14.2 Integer literals [lex.icon]
2.14.4 Floating literals [lex.fcon]
2.14.8 User-defined literals [lex.ext]
C.new.new Clause 2: lexical conventions [diff.cpp11.lex]
Problem
Numeric literals of more than a few digits are hard to read. Consider the following tasks.
- Pronounce 7237498123.
- Compare 237498123 with 237499123 for equality.
- Decide whether 237499123 or 20249472 is larger.
Solution
The problem has a long-standing solution in writing and typography: digit separators. In the English-speaking world, commas are usually used to separate digits.
- Pronounce 7,237,498,123.
- Compare 237,498,123 with 237,499,123 for equality.
- Decide whether 237,499,123 or 20,249,472 is larger.
We wish to introduce digit separators into C++. Much discussion of constraints and alternatives appears in N3499. We propose using an underscore (aka low line) as a digit separator and a double radix point (aka double dot) as a disambiguating suffix separator.
Proposal
2.14.2 Integer literals [lex.icon]
Edit the grammar as follows. Editor, note the change to the binary literal syntax as described in N3472.
integer-literal:
decimal-literal integer-suffixopt
octal-literal integer-suffixopt
hexadecimal-literal integer-suffixopt
decimal-literal:
nonzero-digit
decimal-literal digit-separatoropt digit
octal-literal:
0
octal-literal digit-separatoropt octal-digit
hexadecimal-literal:
0x hexadecimal-digit
0X hexadecimal-digit
hexadecimal-literal digit-separatoropt hexadecimal-digit
binary-literal:
0b binary-digit
0B binary-digit
binary-literal digit-separatoropt binary-digit
nonzero-digit: one of
1 2 3 4 5 6 7 8 9
octal-digit: one of
0 1 2 3 4 5 6 7
hexadecimal-digit: one of
0 1 2 3 4 5 6 7 8 9
a b c d e f
A B C D E F
digit-separator:
_
Edit paragraph 1 as follows.
An integer literal is a sequence of digits that has no period or exponent part, with optional digit separators. These separators are ignored when determining its value. .... [Example: The number twelve can be written 12, 014, or 0XC. The literals 1048576, 1_048_576, 0X100000, 0x10_0000, and 0_004_000_000 all have the same value. —end example]
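The rule that separators are ignored when determining the value can be sketched as a small evaluator. This is only an illustration under stated assumptions, not part of the proposed wording: the names `digit_value` and `proposed_integer_value` are hypothetical, the input is an unsuffixed literal given as text, the character set is assumed to be ASCII-compatible, and overflow is not checked.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Value of one digit character, or -1 if it is not a hexadecimal digit.
// Assumes an ASCII-compatible character set.
int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: evaluates an unsuffixed integer literal written in the
// proposed syntax. Returns false if the text does not match the grammar.
// Overflow is not detected in this sketch.
bool proposed_integer_value(const std::string& text, unsigned long long& value) {
    unsigned base = 10;
    std::size_t pos = 0;
    if (text.size() >= 2 && text[0] == '0') {
        if (text[1] == 'x' || text[1] == 'X') { base = 16; pos = 2; }
        else if (text[1] == 'b' || text[1] == 'B') { base = 2; pos = 2; }
    }
    // A leading 0 without a prefix letter starts an octal-literal; the 0
    // itself is then consumed below as the first digit.
    if (base == 10 && !text.empty() && text[0] == '0') base = 8;

    unsigned long long result = 0;
    bool previous_was_digit = false;
    for (; pos < text.size(); ++pos) {
        const char c = text[pos];
        if (c == '_') {
            // A separator may only appear directly after a digit.
            if (!previous_was_digit) return false;
            previous_was_digit = false;
            continue;
        }
        const int d = digit_value(c);
        if (d < 0 || static_cast<unsigned>(d) >= base) return false;
        result = result * base + static_cast<unsigned>(d);
        previous_was_digit = true;
    }
    // Rejects empty input, a bare prefix, and a trailing separator.
    if (!previous_was_digit) return false;
    value = result;
    return true;
}
```

Passing each of the five spellings from the example to this helper yields the same value, 1048576, because the underscore contributes nothing to the accumulated result.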
2.14.4 Floating literals [lex.fcon]
Edit the grammar as follows.
floating-literal:
fractional-constant exponent-partopt floating-suffixopt
digit-sequence exponent-part floating-suffixopt
fractional-constant:
digit-sequenceopt . digit-sequence
digit-sequence .
exponent-part:
e signopt digit-sequence
E signopt digit-sequence
sign: one of
+ -
digit-sequence:
digit
digit-sequence digit-separatoropt digit
Edit within paragraph 1 as follows.
.... The integer and fraction parts both consist of a sequence of decimal (base ten) digits, with optional digit separators. These separators are ignored when determining the value. [Example: The literals 1.602_176_565e-19 and 1.602176565e-19 have the same value. —end example] ....
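For floating literals, the same rule can be illustrated by dropping the separators and handing the remaining text to the standard library. This is a rough sketch: `proposed_floating_value` is a hypothetical name, and the helper assumes its input is already well formed under the proposed grammar, so it does no validation of where the underscores sit.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: removes the proposed digit separators and lets
// std::strtod compute the value of what remains. No grammar checking is done.
double proposed_floating_value(const std::string& text) {
    std::string without_separators;
    for (char c : text) {
        if (c != '_') without_separators += c;
    }
    return std::strtod(without_separators.c_str(), nullptr);
}
```

With this helper, `"1.602_176_565e-19"` and `"1.602176565e-19"` produce the same `double`, matching the example in the proposed wording.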
2.14.8 User-defined literals [lex.ext]
Edit the grammar as follows. Editor, note the change to the binary literal syntax as described in N3472.
user-defined-literal:
user-defined-integer-literal
user-defined-floating-literal
user-defined-string-literal
user-defined-character-literal
user-defined-integer-literal:
decimal-literal separated-suffix
octal-literal separated-suffix
hexadecimal-literal separated-suffix
binary-literal separated-suffix
user-defined-floating-literal:
fractional-constant exponent-partopt separated-suffix
digit-sequence exponent-part separated-suffix
user-defined-string-literal:
string-literal ud-suffix
user-defined-character-literal:
character-literal ud-suffix
separated-suffix:
suffix-separatoropt ud-suffix
suffix-separator:
..
ud-suffix:
identifier
Edit paragraph 1 as follows.
If a token matches both user-defined-literal and another literal kind, it is treated as the latter. [Example: 123_km and 123.._km are user-defined-literals, but 123_456 and 12LL are integer-literals. —end example] The syntactic non-terminal preceding the ud-suffix or separated-suffix in a user-defined-literal is taken to be the longest sequence of characters that could match that non-terminal.
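The longest-match rule can be sketched as a function that splits the text of an integer-based literal into its numeric part and whatever follows. This is an illustration only: `SplitLiteral`, `is_digit_in_base`, and `split_proposed_literal` are hypothetical names, the input is assumed to be well formed, and the sketch does not tell built-in suffixes such as `LL` apart from a ud-suffix.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct SplitLiteral {
    std::string number;  // longest prefix matching the proposed integer grammar
    std::string suffix;  // text after the number (and after "..", if present)
    bool separated;      // true if ".." stood between number and suffix
};

// True if c is a digit of the given base (ASCII-compatible set assumed).
bool is_digit_in_base(char c, unsigned base) {
    int v = -1;
    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
    return v >= 0 && static_cast<unsigned>(v) < base;
}

// Hypothetical helper: applies the longest-match rule to a well-formed literal.
SplitLiteral split_proposed_literal(const std::string& text) {
    unsigned base = 10;
    std::size_t start = 0;  // index of the first digit
    if (text.size() >= 2 && text[0] == '0') {
        if (text[1] == 'x' || text[1] == 'X') { base = 16; start = 2; }
        else if (text[1] == 'b' || text[1] == 'B') { base = 2; start = 2; }
        else base = 8;
    }
    std::size_t end = start;  // one past the last character of the number
    while (end < text.size()) {
        if (is_digit_in_base(text[end], base)) {
            end += 1;
        } else if (text[end] == '_' && end > start && end + 1 < text.size() &&
                   is_digit_in_base(text[end + 1], base)) {
            // An underscore belongs to the number only when a digit of the
            // current base follows it; consume both characters.
            end += 2;
        } else {
            break;
        }
    }
    SplitLiteral result;
    result.number = text.substr(0, end);
    result.separated = text.compare(end, 2, "..") == 0;
    result.suffix = text.substr(result.separated ? end + 2 : end);
    return result;
}
```

Under this sketch, `123_km` splits into `123` and `_km` because `k` is not a decimal digit, while `123_456` is consumed whole and leaves no suffix.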
C.new.new Clause 2: lexical conventions [diff.cpp11.lex]
Add a new section as follows. Editor: please incorporate with N3652.
Add the new text block below.
2.14 [lex.literal]
**Change:** Digit separator support.
**Rationale:** Required for new features.
**Effect on original feature:** Valid C++ 2011 code may change meaning, and hence possibly fail to compile, in this International Standard. A user-defined literal suffix that begins with an underscore followed by a character that may be interpreted as a digit within the context of the enclosing literal may change meaning. For example, 10_10 changes from integer 10 with a suffix of _10 to an integer 1010. The original meaning can be restored with 10.._10. The literal 0x1234_goo has suffix _goo but the literal 0x1234_foo has suffix oo. The literal 0x1234.._foo has suffix _foo.
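The C++ 2011 side of this incompatibility can be observed directly. The sketch below declares a literal operator with the suffix `_10` purely for demonstration; the namespace `demo` and the operator are hypothetical and exist only to show how a compiler following the C++ 2011 rules reads the token `10_10`.

```cpp
#include <cassert>

namespace demo {

// Hypothetical literal operator with suffix _10; it returns the numeric part
// unchanged so that the effect of tokenization is visible.
constexpr unsigned long long operator""_10(unsigned long long value) {
    return value;
}

// Under C++ 2011 rules, 10_10 is the integer 10 with ud-suffix _10, so the
// operator receives 10. Under the proposed grammar the same token would be
// the integer-literal 1010 and the operator would not be called.
static_assert(10_10 == 10, "10_10 is 10 with suffix _10 in C++ 2011");

}  // namespace demo
```

Code of this shape is what the compatibility note has in mind: it is valid today and would need to be rewritten as `10.._10` to keep its meaning under the proposal.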