5 Lexical conventions [lex]

5.13 Literals [lex.literal]

5.13.3 Character literals [lex.ccon]

encoding-prefix: one of
u8 u U L

basic-c-char:
any member of the translation character set except the U+0027 apostrophe,
U+005c reverse solidus, or new-line character

simple-escape-sequence-char: one of
' " ? \ a b f n r t v

conditional-escape-sequence-char:
any member of the basic character set that is not an octal-digit, a simple-escape-sequence-char, or the characters N, o, u, U, or x

A multicharacter literal is a character-literal whose c-char-sequence consists of more than one c-char.

A multicharacter literal shall not have an encoding-prefix.

If a multicharacter literal contains a c-char that is not encodable as a single code unit in the ordinary literal encoding, the program is ill-formed.

Multicharacter literals are conditionally-supported.
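As a minimal sketch of the rules above, the following compares the type of a one-character literal with that of a multicharacter literal. It assumes an implementation that supports multicharacter literals (they are only conditionally-supported), and the variable names are hypothetical. Compilers such as GCC typically emit a warning for the 'ab' literal.

```cpp
#include <type_traits>

// A single c-char with no encoding-prefix: an ordinary character literal.
constexpr bool single_is_char = std::is_same_v<decltype('a'), char>;

// Two c-chars with no encoding-prefix: a multicharacter literal.
// Assumption: the implementation supports multicharacter literals.
constexpr bool multichar_is_int = std::is_same_v<decltype('ab'), int>;

static_assert(single_is_char, "ordinary character literal has type char");
static_assert(multichar_is_int, "multicharacter literal has type int");

// Not allowed: a multicharacter literal with an encoding-prefix.
// auto bad = u'ab';  // ill-formed
```

Only the type is checked here, because the value of 'ab' is left to the implementation.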

The kind of a character-literal, its type, and its associated character encoding ([lex.charset]) are determined by its encoding-prefix and its c-char-sequence as defined by Table 9.

Table 9: Character literals [tab:lex.ccon.literal]

Encoding prefix  Kind                        Type      Associated character encoding  Example
none             ordinary character literal  char      ordinary literal encoding      'v'
none             multicharacter literal      int       ordinary literal encoding      'abcd'
L                wide character literal      wchar_t   wide literal encoding          L'w'
u8               UTF-8 character literal     char8_t   UTF-8                          u8'x'
u                UTF-16 character literal    char16_t  UTF-16                         u'y'
U                UTF-32 character literal    char32_t  UTF-32                         U'z'
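The type column of Table 9 can be checked at compile time. The sketch below is illustrative only: the variable names are hypothetical, and the char8_t check is guarded by the __cpp_char8_t feature-test macro on the assumption that the code may also be compiled in a pre-C++20 mode where char8_t does not exist.

```cpp
#include <type_traits>

// One check per single-character row of Table 9, using the table's examples.
constexpr bool ordinary_is_char  = std::is_same_v<decltype('v'),  char>;
constexpr bool wide_is_wchar_t   = std::is_same_v<decltype(L'w'), wchar_t>;
constexpr bool utf16_is_char16_t = std::is_same_v<decltype(u'y'), char16_t>;
constexpr bool utf32_is_char32_t = std::is_same_v<decltype(U'z'), char32_t>;

#if defined(__cpp_char8_t)
// Only meaningful where char8_t is available (C++20 and later).
constexpr bool utf8_is_char8_t = std::is_same_v<decltype(u8'x'), char8_t>;
static_assert(utf8_is_char8_t, "u8 prefix yields char8_t");
#endif

static_assert(ordinary_is_char,  "no prefix yields char");
static_assert(wide_is_wchar_t,   "L prefix yields wchar_t");
static_assert(utf16_is_char16_t, "u prefix yields char16_t");
static_assert(utf32_is_char32_t, "U prefix yields char32_t");
```

The multicharacter row is left out so that the snippet does not depend on a conditionally-supported feature.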

In translation phase 4, the value of a character-literal is determined using the range of representable values of the character-literal's type in translation phase 7.
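One way to observe this rule is to evaluate the same comparison once in a preprocessor condition (phase 4) and once as an ordinary constant expression (phase 7). The following is a sketch under the assumption of an 8-bit char, so that '\xff' is negative exactly when plain char is signed; the names pp_sees_negative and lang_sees_negative are hypothetical.

```cpp
// Phase 4: the preprocessor evaluates the character literal in #if.
#if '\xff' < 0
constexpr bool pp_sees_negative = true;
#else
constexpr bool pp_sees_negative = false;
#endif

// Phase 7: the same comparison as an ordinary constant expression.
constexpr bool lang_sees_negative = ('\xff' < 0);

// Both phases are expected to agree, whether plain char is signed or not.
static_assert(pp_sees_negative == lang_sees_negative,
              "phase 4 and phase 7 disagree on the value of '\\xff'");
```

Whether both flags are true or both are false depends on the signedness of char on the target; the sketch only checks that the two phases agree.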

A multicharacter literal has an implementation-defined value.

The value of any other kind of character-literal is determined as follows:

The character specified by a simple-escape-sequence is specified in Table 10.

[Note 1:

Using an escape sequence for a question mark is supported for compatibility with C++ 2014 and C.

— _end note_]

Table 10: Simple escape sequences [tab:lex.ccon.esc]

character                    simple-escape-sequence
U+000a line feed             \n
U+0009 character tabulation  \t
U+000b line tabulation       \v
U+0008 backspace             \b
U+000d carriage return       \r
U+000c form feed             \f
U+0007 alert                 \a
U+005c reverse solidus       \\
U+003f question mark         \?
U+0027 apostrophe            \'
U+0022 quotation mark        \"
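The mapping in Table 10 can be spot-checked with UTF-32 character literals. This is a sketch, not normative text: it relies on the value of a UTF-32 character literal being the code point of the character it specifies, which is not stated in this excerpt, and the variable name is hypothetical. The U prefix is used so that the check does not depend on the ordinary literal encoding.

```cpp
// Each simple-escape-sequence compared with the code point listed in Table 10.
constexpr bool escapes_match_table10 =
    U'\n' == 0x000a &&  // line feed
    U'\t' == 0x0009 &&  // character tabulation
    U'\v' == 0x000b &&  // line tabulation
    U'\b' == 0x0008 &&  // backspace
    U'\r' == 0x000d &&  // carriage return
    U'\f' == 0x000c &&  // form feed
    U'\a' == 0x0007 &&  // alert
    U'\\' == 0x005c &&  // reverse solidus
    U'\?' == 0x003f &&  // question mark
    U'\'' == 0x0027 &&  // apostrophe
    U'\"' == 0x0022;    // quotation mark

static_assert(escapes_match_table10, "escape sequences match Table 10");

// The question mark and quotation mark can also be written unescaped.
static_assert(U'\?' == U'?', "escaped and plain question mark are equal");
static_assert(U'\"' == U'"', "escaped and plain quotation mark are equal");
```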