28.5.6.5 Formatting escaped characters and strings [format.string.escaped]
A character or string can be formatted as escaped to make it more suitable for debugging or for logging.
The escaped string representation E of a string S is constructed by encoding a sequence of characters as follows.
The associated character encoding CE for charT (Table 12) is used to both interpret S and construct E.
- U+0022 quotation mark (") is appended to E.
- For each code unit sequence X in S that either encodes a single character, is a shift sequence, or is a sequence of ill-formed code units, processing is in order as follows:
  - If X encodes a single character C, then:
    * If C is one of the characters in Table 112, then the two characters shown as the corresponding escape sequence are appended to E.
    * Otherwise, if C is not U+0020 space and
      * CE is UTF-8, UTF-16, or UTF-32 and C corresponds to a Unicode scalar value whose Unicode property General_Category has a value in the groups Separator (Z) or Other (C), as described by UAX #44 of the Unicode Standard, or
      * CE is UTF-8, UTF-16, or UTF-32 and C corresponds to a Unicode scalar value with the Unicode property Grapheme_Extend=Yes as described by UAX #44 of the Unicode Standard and C is not immediately preceded in S by a character P appended to E without translation to an escape sequence, or
      * CE is neither UTF-8, UTF-16, nor UTF-32 and C is one of an implementation-defined set of separator or non-printable characters
      then the sequence \u{hex-digit-sequence} is appended to E, where hex-digit-sequence is the shortest hexadecimal representation of C using lower-case hexadecimal digits.
    * Otherwise, C is appended to E.
  - Otherwise, if X is a shift sequence, the effect on E and further decoding of S is unspecified.
    Recommended practice: A shift sequence should be represented in E such that the original code unit sequence of S can be reconstructed.
  - Otherwise (X is a sequence of ill-formed code units), each code unit U is appended to E in order as the sequence \x{hex-digit-sequence}, where hex-digit-sequence is the shortest hexadecimal representation of U using lower-case hexadecimal digits.
- Finally, U+0022 quotation mark (") is appended to E.
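The steps above can be illustrated with a small hand-written sketch. It is not how any standard library implements the escaping; it assumes a UTF-8 encoding and, as a deliberate simplification, only understands 7-bit ASCII: every byte with the high bit set is treated as an ill-formed code unit, and the only characters counted as General_Category Other (C) are the ASCII controls. The names `shortest_hex_sketch` and `escape_ascii_sketch` are hypothetical.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: shortest lower-case hexadecimal representation of v.
std::string shortest_hex_sketch(unsigned v) {
    const char digits[] = "0123456789abcdef";
    std::string out;
    do {
        out.insert(out.begin(), digits[v % 16]);
        v /= 16;
    } while (v != 0);
    return out;
}

// Hypothetical sketch of the escaped string representation, ASCII only.
std::string escape_ascii_sketch(std::string_view s) {
    std::string e = "\"";                       // opening quotation mark
    for (unsigned char u : s) {
        switch (u) {
        case '\t': e += "\\t";  break;          // Table 112 entries
        case '\n': e += "\\n";  break;
        case '\r': e += "\\r";  break;
        case '"':  e += "\\\""; break;
        case '\\': e += "\\\\"; break;
        default:
            if (u >= 0x80) {
                // Simplification: treat every non-ASCII byte as ill-formed.
                e += "\\x{" + shortest_hex_sketch(u) + "}";
            } else if (u < 0x20 || u == 0x7f) {
                // ASCII control characters have General_Category Cc.
                e += "\\u{" + shortest_hex_sketch(u) + "}";
            } else {
                e += static_cast<char>(u);      // appended unchanged
            }
        }
    }
    e += '"';                                   // closing quotation mark
    return e;
}
```

Under these assumptions, `escape_ascii_sketch("h\tllo")` yields the characters `"h\tllo"` with the tab spelled as a two-character escape sequence. A real implementation would decode multi-byte sequences and consult Unicode property tables instead of the two range checks used here.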
Table 112: Mapping of characters to escape sequences [tab:format.escape.sequences]
| Character | Escape sequence |
|---|---|
| U+0009 character tabulation | \t |
| U+000a line feed | \n |
| U+000d carriage return | \r |
| U+0022 quotation mark | \" |
| U+005c reverse solidus | \\ |
The escaped string representation of a character C is equivalent to the escaped string representation of a string of C, except that:
- the result starts and ends with U+0027 apostrophe (') instead of U+0022 quotation mark ("), and
- if C is U+0027 apostrophe, the two characters \' are appended to E, and
- if C is U+0022 quotation mark, then C is appended unchanged.
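The three differences for a single character can be sketched as follows. This is an illustration under the same kind of simplifying assumptions as any ASCII-only model (non-ASCII bytes are treated as ill-formed code units, and only ASCII controls are escaped with `\u{...}`); the function name `escape_ascii_char_sketch` is hypothetical and not part of the library.

```cpp
#include <string>

// Hypothetical sketch: escaped representation of one ASCII character.
std::string escape_ascii_char_sketch(char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    const char digits[] = "0123456789abcdef";

    // Shortest lower-case hex form of a single byte (one or two digits).
    std::string hex;
    if (u >= 16) hex += digits[u / 16];
    hex += digits[u % 16];

    std::string e = "'";                 // apostrophe instead of quotation mark
    switch (u) {
    case '\t': e += "\\t";  break;
    case '\n': e += "\\n";  break;
    case '\r': e += "\\r";  break;
    case '\\': e += "\\\\"; break;
    case '\'': e += "\\'";  break;       // the apostrophe is escaped
    default:
        if (u >= 0x80)                   e += "\\x{" + hex + "}";
        else if (u < 0x20 || u == 0x7f)  e += "\\u{" + hex + "}";
        else                             e += c;   // includes '"', left unchanged
    }
    e += "'";
    return e;
}
```

Note that the quotation mark deliberately has no case of its own: it falls through to the default branch and is appended as is.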
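[Example 1:
string s0 = format("[{}]", "h\tllo");               // s0 contains an unescaped tab: [h<TAB>llo]
string s1 = format("[{:?}]", "h\tllo");             // s1 has value: ["h\tllo"]
string s2 = format("[{:?}]", "Спасибо, Виктор ♥!"); // s2 has value: ["Спасибо, Виктор ♥!"]
string s3 = format("[{:?}, {:?}]", '\'', '"');      // s3 has value: ['\'', '"']

// The following examples assume use of the UTF-8 encoding
string s4 = format("[{:?}]", string("\0 \n \t \x02 \x1b", 9));
                                                    // s4 has value: ["\u{0} \n \t \u{2} \u{1b}"]
string s5 = format("[{:?}]", "\xc3\x28");           // invalid UTF-8, s5 has value: ["\x{c3}("]
string s6 = format("[{:?}]", "🤷🏻‍♂️");               // s6 has value: ["🤷🏻\u{200d}♂️"]
string s7 = format("[{:?}]", "\u0301");             // s7 has value: ["\u{301}"]
string s8 = format("[{:?}]", "\\\u0301");           // s8 has value: ["\\\u{301}"]
string s9 = format("[{:?}]", "e\u0301\u0323");      // s9 has value: ["ẹ́"] (combining marks kept unescaped)
- end example]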
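Examples s7 to s9 exercise the Grapheme_Extend rule: a combining character is escaped only when the character before it was not appended to E unchanged. The sketch below models just that rule on UTF-32 input. It is an illustration, not a conforming implementation: the hypothetical predicate `is_grapheme_extend_sketch` stands in for the real Unicode property and only recognizes the Combining Diacritical Marks block (U+0300 to U+036F), and of the Table 112 escapes only the reverse solidus and the quotation mark are handled.

```cpp
#include <string>
#include <string_view>

// Assumption: a tiny stand-in for Grapheme_Extend=Yes that only knows
// the Combining Diacritical Marks block.
bool is_grapheme_extend_sketch(char32_t c) {
    return c >= 0x0300 && c <= 0x036F;
}

// Shortest lower-case hexadecimal representation of c.
std::u32string hex_u32_sketch(char32_t c) {
    const char32_t digits[] = U"0123456789abcdef";
    std::u32string out;
    do {
        out.insert(out.begin(), digits[c % 16]);
        c /= 16;
    } while (c != 0);
    return out;
}

// Hypothetical sketch: escapes only '\\', '"', and unattached combining marks.
std::u32string escape_combining_sketch(std::u32string_view s) {
    std::u32string e = U"\"";
    bool prev_unescaped = false;   // was the previous character appended as is?
    for (char32_t c : s) {
        if (c == U'\\' || c == U'"') {
            e += U'\\';
            e += c;
            prev_unescaped = false;            // translated to an escape sequence
        } else if (is_grapheme_extend_sketch(c) && !prev_unescaped) {
            e += U"\\u{" + hex_u32_sketch(c) + U"}";
            prev_unescaped = false;
        } else {
            e += c;                            // appended without translation
            prev_unescaped = true;
        }
    }
    e += U'"';
    return e;
}
```

With this model, a leading U+0301 is escaped, a U+0301 after an escaped reverse solidus is escaped as well, and the marks in `e` followed by U+0301 and U+0323 are kept, because each one follows a character that was appended unchanged.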