5 Lexical conventions [lex]

5.3 Characters [lex.char]

5.3.2 Universal character names [lex.universal.char]

n-char:
any member of the translation character set except the U+007d right curly bracket or new-line character

The universal-character-name construct provides a way to name any element in the translation character set using just the basic character set.

A universal-character-name of the form \u hex-quad, \U hex-quad hex-quad, or \u{simple-hexadecimal-digit-sequence} designates the character in the translation character set whose Unicode scalar value is the hexadecimal number represented by the sequence of hexadecimal-digits in the universal-character-name.

The program is ill-formed if that number is not a Unicode scalar value.
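As an illustration only, the following sketch spells the same character with the \u hex-quad and \U hex-quad hex-quad forms and checks that both designate the same scalar value. The names `short_form`, `long_form`, `grinning_face`, and `is_unicode_scalar_value` are hypothetical and do not come from the standard text. The braced form (for example `\u{E9}`) is left out of the compiled code because it assumes a compiler that implements the C++23 delimited escapes.

```cpp
#include <cassert>

// Two spellings of U+00E9 (LATIN SMALL LETTER E WITH ACUTE).
constexpr char32_t short_form = U'\u00E9';      // u followed by one hex-quad
constexpr char32_t long_form  = U'\U000000E9';  // U followed by two hex-quads

static_assert(short_form == long_form, "both forms designate the same character");
static_assert(short_form == 0xE9, "the hexadecimal number is the scalar value");

// A character above U+FFFF does not fit in a single hex-quad,
// so the eight-digit form is used here.
constexpr char32_t grinning_face = U'\U0001F600';
static_assert(grinning_face == 0x1F600, "U+1F600 GRINNING FACE");

// Hypothetical helper mirroring the well-formedness rule: a Unicode scalar
// value is any code point in [0, 0x10FFFF] that is not a surrogate
// (0xD800 through 0xDFFF).
constexpr bool is_unicode_scalar_value(char32_t n) {
    return n <= 0x10FFFF && !(n >= 0xD800 && n <= 0xDFFF);
}

// Writing the surrogate D800 or the out-of-range value 110000 as a
// universal-character-name would make the program ill-formed.
static_assert(!is_unicode_scalar_value(0xD800), "surrogates are not scalar values");
static_assert(!is_unicode_scalar_value(0x110000), "beyond the Unicode code space");
```

The helper only restates the numeric range check. A conforming compiler performs the real check at translation time and rejects the offending escape.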

A universal-character-name that is a named-universal-character designates the corresponding character in the Unicode Standard (chapter 4.8 Name) if the n-char-sequence is equal to its character name or to one of its character name aliases of type “control”, “correction”, or “alternate”; otherwise, the program is ill-formed.

[Note 2:

These aliases are listed in the Unicode Character Database's NameAliases.txt.

None of these names or aliases have leading or trailing spaces.

— _end note_]
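A minimal sketch of the named form, assuming a compiler that implements named universal characters and advertises them through the `__cpp_named_character_escapes` feature-test macro. The variable names are hypothetical, and the fallback branch exists only so the sketch still compiles on older compilers. The alias LINE FEED is assumed to be listed with type “control” for U+000A in NameAliases.txt.

```cpp
#include <cassert>

// Reference value written with a numeric universal-character-name.
constexpr char32_t by_number = U'\u00E9';

#if defined(__cpp_named_character_escapes)
// The n-char-sequence must match the character name exactly,
// with no leading or trailing spaces.
constexpr char32_t by_name = U'\N{LATIN SMALL LETTER E WITH ACUTE}';

// U+000A has no character name of its own; it is reached here through
// an alias of type "control".
constexpr char32_t line_feed = U'\N{LINE FEED}';

constexpr bool named_escapes_available = true;
#else
// Fallback for compilers without named escapes (illustration only).
constexpr char32_t by_name = by_number;
constexpr char32_t line_feed = U'\n';
constexpr bool named_escapes_available = false;
#endif

static_assert(by_name == by_number, "name and number designate the same character");
static_assert(line_feed == 0x0A, "LINE FEED is U+000A");
```

An abbreviation such as LF is also listed for U+000A, but its alias type is “abbreviation”, so using it as the n-char-sequence would make the program ill-formed under the rule above.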