Standard format specification (since C++20)
For basic types and string types, the format specification is based on the format specification in Python.
The syntax of format specifications is:
fill-and-align (optional) sign (optional) **#** (optional) **0** (optional) width (optional) precision (optional) **L** (optional) type (optional)
The sign, **#** and **0** options are only valid when an integer or floating-point presentation type is used.
Contents
- 1 Fill and align
- 2 Sign, #, and 0
- 3 Width and precision
- 4 L (locale-specific formatting)
- 5 Type
- 6 Formatting escaped characters and strings
- 7 Notes
- 8 Defect reports
Fill and align
fill-and-align is an optional fill character (which can be any character other than **{** or **}**), followed by one of the align options **<**, **>**, **^**.
If no fill character is specified, it defaults to the space character. For a format specification in a Unicode encoding, the fill character must correspond to a single Unicode scalar value.
The meaning of align options is as follows:
- **<**: Forces the formatted argument to be aligned to the start of the available space by inserting n fill characters after the formatted argument. This is the default when a non-integer non-floating-point presentation type is used.
- **>**: Forces the formatted argument to be aligned to the end of the available space by inserting n fill characters before the formatted argument. This is the default when an integer or floating-point presentation type is used.
- **^**: Forces the formatted argument to be centered within the available space by inserting ⌊n/2⌋ fill characters before and ⌈n/2⌉ fill characters after the formatted argument.
In each case, n is the difference of the minimum field width (specified by width) and the estimated width of the formatted argument, or 0 if the difference is less than 0.
Sign, #, and 0
The sign option can be one of the following:
- **+**: Indicates that a sign should be used for both non-negative and negative numbers. The + sign is inserted before the output value for non-negative numbers.
- **-**: Indicates that a sign should be used for negative numbers only (this is the default behavior).
- space: Indicates that a leading space should be used for non-negative numbers, and a minus sign for negative numbers.
Negative zero is treated as a negative number.
The sign option applies to floating-point infinity and NaN.
```cpp
#include <cassert>
#include <format>
#include <limits>

int main()
{
    double inf = std::numeric_limits<double>::infinity();
    double nan = std::numeric_limits<double>::quiet_NaN();

    // Each argument is formatted four times: default, '+', '-', and space.
    assert(std::format("{0:},{0:+},{0:-},{0: }", 1)   == "1,+1,1, 1");
    assert(std::format("{0:},{0:+},{0:-},{0: }", -1)  == "-1,-1,-1,-1");
    assert(std::format("{0:},{0:+},{0:-},{0: }", inf) == "inf,+inf,inf, inf");
    assert(std::format("{0:},{0:+},{0:-},{0: }", nan) == "nan,+nan,nan, nan");
}
```
The **#** option causes the alternate form to be used for the conversion.
- For integral types, when binary, octal, or hexadecimal presentation type is used, the alternate form inserts the prefix (0b, 0, or 0x) into the output value after the sign character (possibly space) if there is one, or adds it before the output value otherwise.
- For floating-point types, the alternate form causes the result of the conversion of finite values to always contain a decimal-point character, even if no digits follow it. Normally, a decimal-point character appears in the result of these conversions only if a digit follows it. In addition, for **g** and **G** conversions, trailing zeros are not removed from the result.
The **0** option pads the field with leading zeros (following any indication of sign or base) to the field width, except when applied to an infinity or NaN. If the 0 character and an align option both appear, the 0 character is ignored.
Width and precision
width is either a positive decimal number, or a nested replacement field (**{}** or **{**n**}**). If present, it specifies the minimum field width.
precision is a dot (**.**) followed by either a non-negative decimal number or a nested replacement field. This field indicates the precision or maximum field size. It can only be used with floating-point and string types.
- For floating-point types, this field specifies the formatting precision.
- For string types, it provides an upper bound for the estimated width (see below) of the prefix of the string to be copied to the output. For a string in a Unicode encoding, the text to be copied to the output is the longest prefix of whole extended grapheme clusters whose estimated width is no greater than the precision.
If a nested replacement field is used for width or precision, and the corresponding argument is not of integral type (until C++23) or not of a standard signed or unsigned integer type (since C++23), or is negative, an exception of type std::format_error is thrown.
```cpp
float pi = 3.14f;

// Literal and nested (dynamic) width and precision give the same result.
assert(std::format("{:10f}", pi)           == "  3.140000"); // width = 10
assert(std::format("{:{}f}", pi, 10)       == "  3.140000"); // width = 10
assert(std::format("{:.5f}", pi)           == "3.14000");    // precision = 5
assert(std::format("{:.{}f}", pi, 5)       == "3.14000");    // precision = 5
assert(std::format("{:10.5f}", pi)         == "   3.14000"); // width = 10, precision = 5
assert(std::format("{:{}.{}f}", pi, 10, 5) == "   3.14000"); // width = 10, precision = 5

auto b1 = std::format("{:{}f}", pi, 10.0); // throws: width is not of integral type
auto b2 = std::format("{:{}f}", pi, -10);  // throws: width is negative
auto b3 = std::format("{:.{}f}", pi, 5.0); // throws: precision is not of integral type
```
The width of a string is defined as the estimated number of column positions appropriate for displaying it in a terminal.
For the purpose of width computation, a string is assumed to be in an implementation-defined encoding. The method of width computation is unspecified, but for a string in a Unicode encoding, implementations should estimate the width of the string as the sum of estimated widths of the first code points in its extended grapheme clusters. The estimated width is 2 for the following code points, and is 1 otherwise:
- Any code point whose Unicode property East_Asian_Width has value Fullwidth (F) or Wide (W)
- U+4DC0 – U+4DFF (Yijing Hexagram Symbols)
- U+1F300 – U+1F5FF (Miscellaneous Symbols and Pictographs)
- U+1F900 – U+1F9FF (Supplemental Symbols and Pictographs)
L (locale-specific formatting)
The **L** option causes the locale-specific form to be used. This option is only valid for arithmetic types.
- For integral types, the locale-specific form inserts the appropriate digit group separator characters according to the context's locale.
- For floating-point types, the locale-specific form inserts the appropriate digit group and radix separator characters according to the context's locale.
- For the textual representation of bool, the locale-specific form uses the appropriate string as if obtained with std::numpunct::truename or std::numpunct::falsename.
Type
The type option determines how the data should be presented.
The available string presentation types are:
- none, **s**: Copies the string to the output.
- **?**: Copies the escaped string (see below) to the output. (since C++23)
The available integer presentation types for integral types other than char, wchar_t, and bool are:
- **b**: Binary format. Produces the output as if by calling std::to_chars(first, last, value, 2). The base prefix is 0b.
- **B**: same as **b**, except that the base prefix is 0B.
- **c**: Copies the character static_cast<CharT>(value) to the output, where CharT is the character type of the format string. Throws std::format_error if value is not in the range of representable values for CharT.
- **d**: Decimal format. Produces the output as if by calling std::to_chars(first, last, value).
- **o**: Octal format. Produces the output as if by calling std::to_chars(first, last, value, 8). The base prefix is 0 if the corresponding argument value is non-zero and is empty otherwise.
- **x**: Hex format. Produces the output as if by calling std::to_chars(first, last, value, 16). The base prefix is 0x.
- **X**: same as **x**, except that it uses uppercase letters for digits above 9 and the base prefix is 0X.
- none: same as **d**.
The available char and wchar_t presentation types are:
- none, **c**: Copies the character to the output.
- **b**, **B**, **d**, **o**, **x**, **X**: Uses integer presentation types with the value static_cast<unsigned char>(value) or static_cast<std::make_unsigned_t<wchar_t>>(value) respectively.
- **?**: Copies the escaped character (see below) to the output. (since C++23)
The available bool presentation types are:
- none, **s**: Copies textual representation (**true** or **false**, or the locale-specific form) to the output.
- **b**, **B**, **d**, **o**, **x**, **X**: Uses integer presentation types with the value static_cast<unsigned char>(value).
The available floating-point presentation types are:
- **a**: If precision is specified, produces the output as if by calling std::to_chars(first, last, value, std::chars_format::hex, precision) where precision is the specified precision; otherwise, the output is produced as if by calling std::to_chars(first, last, value, std::chars_format::hex).
- **A**: same as **a**, except that it uses uppercase letters for digits above 9 and uses P to indicate the exponent.
- **e**: Produces the output as if by calling std::to_chars(first, last, value, std::chars_format::scientific, precision) where precision is the specified precision, or 6 if precision is not specified.
- **E**: same as **e**, except that it uses E to indicate the exponent.
- **f**, **F**: Produces the output as if by calling std::to_chars(first, last, value, std::chars_format::fixed, precision) where precision is the specified precision, or 6 if precision is not specified.
- **g**: Produces the output as if by calling std::to_chars(first, last, value, std::chars_format::general, precision) where precision is the specified precision, or 6 if precision is not specified.
- **G**: same as **g**, except that it uses E to indicate the exponent.
- none: If precision is specified, produces the output as if by calling std::to_chars(first, last, value, std::chars_format::general, precision) where precision is the specified precision; otherwise, the output is produced as if by calling std::to_chars(first, last, value).
For lower-case presentation types, infinity and NaN are formatted as inf and nan, respectively. For upper-case presentation types, infinity and NaN are formatted as INF and NAN, respectively.
| std::format specifier | std::chars_format | corresponding std::printf specifier |
|---|---|---|
| a, A | std::chars_format::hex | a, A (but std::format does not output leading 0x or 0X) |
| e, E | std::chars_format::scientific | e, E |
| f, F | std::chars_format::fixed | f, F |
| g, G | std::chars_format::general | g, G |
| none | std::chars_format::general if precision is specified, otherwise the shortest round-trip format | g if precision is specified. Otherwise there's no corresponding specifier. |
The available pointer presentation types (also used for std::nullptr_t) are:
- none, **p**: If std::uintptr_t is defined, produces the output as if by calling std::to_chars(first, last, reinterpret_cast<std::uintptr_t>(value), 16) with the prefix 0x added to the output; otherwise, the output is implementation-defined.
- **P**: same as **p**, except that it uses uppercase letters for digits above 9 and the base prefix is 0X. (since C++26)
Formatting escaped characters and strings (since C++23)
A character or string can be formatted as escaped to make it more suitable for debugging or for logging.
Escaping is done as follows:
- For each well-formed code unit sequence that encodes a character C:
  - If C is one of the characters in the table below, the corresponding escape sequence is used.
  - Otherwise, if C is not the space character (byte 0x20 in ASCII encoding), and either
    - the associated character encoding is a Unicode encoding and
      - C corresponds to a Unicode scalar value whose Unicode property General_Category has a value in the groups Separator (Z) or Other (C), or
      - C is not immediately preceded by a non-escaped character, and C corresponds to a Unicode scalar value which has the Unicode property Grapheme_Extend=Yes, or
    - the associated character encoding is not a Unicode encoding and C is one of an implementation-defined set of separator or non-printable characters,

    the escape sequence is \u{hex-digit-sequence}, where hex-digit-sequence is the shortest hexadecimal representation of C using lower-case hexadecimal digits.
  - Otherwise, C is copied as is.
- A code unit sequence that is a shift sequence has unspecified effect on the output and further decoding of the string.
- Other code units (i.e. those in ill-formed code unit sequences) are each replaced with \x{hex-digit-sequence}, where hex-digit-sequence is the shortest hexadecimal representation of the code unit using lower-case hexadecimal digits.

| Character | Escape sequence | Notes |
|---|---|---|
| horizontal tab (byte 0x09 in ASCII encoding) | \t | |
| line feed - new line (byte 0x0a in ASCII encoding) | \n | |
| carriage return (byte 0x0d in ASCII encoding) | \r | |
| double quote (byte 0x22 in ASCII encoding) | \" | Used only if the output is a double-quoted string |
| single quote (byte 0x27 in ASCII encoding) | \' | Used only if the output is a single-quoted string |
| backslash (byte 0x5c in ASCII encoding) | \\ | |

The escaped string representation of a string is constructed by escaping the code unit sequences in the string, as described above, and quoting the result with double quotes.
The escaped representation of a character is constructed by escaping it as described above, and quoting the result with single quotes.

```cpp
#include <print>
#include <string>

int main()
{
    std::println("[{:?}]", "h\tllo");             // prints: ["h\tllo"]
    std::println("[{:?}]", "Спасибо, Виктор ♥!"); // prints: ["Спасибо, Виктор ♥!"]
    std::println("[{:?}] [{:?}]", '\'', '"');     // prints: ['\''] ['"']

    // The following examples assume use of the UTF-8 encoding
    std::println("[{:?}]", std::string("\0 \n \t \x02 \x1b", 9));
                                                  // prints: ["\u{0} \n \t \u{2} \u{1b}"]
    std::println("[{:?}]", "\xc3\x28");           // invalid UTF-8
                                                  // prints: ["\x{c3}("]
    std::println("[{:?}]", "\u0301");             // prints: ["\u{301}"]
    std::println("[{:?}]", "\\\u0301");           // prints: ["\\\u{301}"]
    std::println("[{:?}]", "e\u0301\u0323");      // prints: ["ẹ́"]
}
```
Notes
In most cases the syntax is similar to the old **%**-formatting, with the addition of **{}** and with **:** used instead of **%**. For example, "%03.2f" can be translated to "{:03.2f}".
| Feature-test macro | Value | Std | Feature |
|---|---|---|---|
| __cpp_lib_format_uchar | 202311L | (C++20)(DR) | Formatting of code units as unsigned integers |
Defect reports
The following behavior-changing defect reports were applied retroactively to previously published C++ standards.
| DR | Applied to | Behavior as published | Correct behavior |
|---|---|---|---|
| LWG 3721 | C++20 | zero is not allowed for the width field in standard format specification | zero is permitted if specified via a replacement field |
| P2909R4 | C++20 | char or wchar_t might be formatted as out-of-range unsigned integer values | code units are converted to the corresponding unsigned type before such formatting |