28 Text processing library [text]
28.4 Text encodings identification [text.encoding]
28.4.2 Class text_encoding [text.encoding.class]
28.4.2.3 Members [text.encoding.members]
constexpr explicit text_encoding(string_view enc) noexcept;
Preconditions:
- enc represents a string in the ordinary literal encoding consisting only of elements of the basic character set ([lex.charset]).
- enc.size() <= max_name_length is true.
- enc.contains('\0') is false.
Postconditions:
- If there exists a primary name or alias a of a known registered character encoding such that _comp-name_(a, enc) is true, mib_ has the value of the enumerator of id associated with that registered character encoding.
Otherwise, mib_ == id::other is true.
- enc.compare(name_) == 0 is true.
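The preconditions above can be checked before constructing an object from untrusted input. The following is a minimal sketch, not part of the specification: satisfies_name_preconditions and assumed_max_name_length are hypothetical names, the value 63 is an assumption about max_name_length (this subclause does not state it), and the character list only approximates the basic character set, whose exact membership depends on the standard revision.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Assumption: 63 mirrors text_encoding::max_name_length; the value is not
// stated in this subclause.
inline constexpr std::size_t assumed_max_name_length = 63;

// Hypothetical helper: returns true if `enc` appears to satisfy the
// preconditions listed for text_encoding(string_view).
constexpr bool satisfies_name_preconditions(std::string_view enc) noexcept {
    // Approximation of the basic character set ([lex.charset]). Newer
    // revisions may contain additional characters that are rejected here.
    constexpr std::string_view basic =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789"
        " \t\v\f\n"
        "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";

    // enc.size() <= max_name_length
    if (enc.size() > assumed_max_name_length) return false;

    for (const char c : enc) {
        // enc.contains('\0') is false
        if (c == '\0') return false;
        // only elements of the (approximated) basic character set
        if (basic.find(c) == std::string_view::npos) return false;
    }
    return true;
}

static_assert(satisfies_name_preconditions("UTF-8"));
static_assert(!satisfies_name_preconditions(std::string_view("a\0b", 3)));
```

A caller could reject or sanitize a name for which the helper returns false instead of passing it to the constructor, since violating a precondition is undefined behavior.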
constexpr text_encoding(id i) noexcept;
Preconditions: i has the value of one of the enumerators of id.
Postconditions:
- If (mib_ == id::unknown || mib_ == id::other) is true, strlen(name_) == 0 is true.
Otherwise, ranges::contains(aliases(), string_view(name_)) is true.
constexpr id mib() const noexcept;
constexpr const char* name() const noexcept;
Remarks: name() is an ntbs and accessing elements of name_ outside of the range name()+[0, strlen(name()) + 1) is undefined behavior.
constexpr aliases_view aliases() const noexcept;
Let r denote an instance of aliases_view.
If *this represents a known registered character encoding, then:
- r.front() is the primary name of the registered character encoding,
- r contains the aliases of the registered character encoding, and
- r does not contain duplicate values when compared with strcmp.
Otherwise, r is an empty range.
Each element in r is a non-null, non-empty ntbs encoded in the literal character encoding and comprising only characters from the basic character set.
[Note 1:
The order of aliases in r is unspecified.
— _end note_]
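The guarantees on r for a known registered character encoding can be expressed as a small property check. This is a sketch under stated assumptions: satisfies_alias_guarantees is a hypothetical helper, a std::vector of C strings stands in for the contents of an aliases_view, and the basic-character-set requirement is not checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical checker: `r` stands in for the elements of an aliases_view and
// `primary` for the primary name of the registered character encoding.
bool satisfies_alias_guarantees(const std::vector<const char*>& r,
                                const char* primary) {
    // A known registered encoding has at least its primary name.
    if (r.empty()) return false;

    // Each element is a non-null, non-empty ntbs.
    for (const char* s : r) {
        if (s == nullptr || *s == '\0') return false;
    }

    // r.front() is the primary name.
    if (std::strcmp(r.front(), primary) != 0) return false;

    // No duplicate values when compared with strcmp.
    for (std::size_t i = 0; i < r.size(); ++i) {
        for (std::size_t j = i + 1; j < r.size(); ++j) {
            if (std::strcmp(r[i], r[j]) == 0) return false;
        }
    }
    return true;
}
```

For example, the range {"UTF-8", "csUTF8"} passes with primary name "UTF-8", while the same elements in the opposite order do not, because only the position of the primary name is specified.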
static consteval text_encoding literal() noexcept;
Mandates: CHAR_BIT == 8 is true.
Returns: A text_encoding object representing the ordinary character literal encoding ([lex.charset]).
static text_encoding environment();
Mandates: CHAR_BIT == 8 is true.
Returns: A text_encoding object representing the implementation-defined character encoding scheme of the environment.
On a POSIX implementation, this is the encoding scheme associated with the POSIX locale denoted by the empty string "".
[Note 2:
This function is not affected by calls to setlocale.
— _end note_]
Recommended practice: Implementations should return a value that is not affected by calls to the POSIX function setenv and other functions which can modify the environment ([support.runtime]).
template<id i> static bool environment_is();
Mandates: CHAR_BIT == 8 is true.
Returns: environment() == i.
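A possible use of environment() and environment_is is sketched below. It assumes a library that ships the <text_encoding> header and defines the __cpp_lib_text_encoding feature-test macro; where that is not the case, the hypothetical function describe_environment_encoding falls back to a fixed string. The "(unnamed)" and "[UTF-8]" markers are illustrative only.

```cpp
#include <cassert>
#include <string>
#if __has_include(<text_encoding>)
#include <text_encoding>
#endif

// Hypothetical helper: summarizes the environment encoding as a string.
std::string describe_environment_encoding() {
#if defined(__cpp_lib_text_encoding)
    const std::text_encoding env = std::text_encoding::environment();

    // Guard against a null or empty name, which can occur for encodings
    // that are unknown to the implementation.
    const char* n = env.name();
    std::string out = (n != nullptr && *n != '\0') ? n : "(unnamed)";

    // environment_is<i>() is equivalent to environment() == i.
    if (std::text_encoding::environment_is<std::text_encoding::id::UTF8>()) {
        out += " [UTF-8]";
    }
    return out;
#else
    // Assumption: no library support is available in this configuration.
    return "(std::text_encoding unavailable)";
#endif
}
```

Because environment() is not constexpr and reflects the environment the program starts in, the result can differ between runs of the same binary.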
static constexpr bool _comp-name_(string_view a, string_view b);
Returns: true if the two strings a and b, encoded in the ordinary literal encoding, are equal, ignoring, from left to right,
- all elements that are not digits or letters ([character.seq.general]),
- character case, and
- any sequence of one or more 0 characters not immediately preceded by a numeric prefix, where a numeric prefix is a sequence consisting of a digit in the range [1, 9] optionally followed by one or more elements which are not digits or letters,
and false otherwise.
[Note 3:
This comparison is identical to the “Charset Alias Matching” algorithm described in the Unicode Technical Standard 22 [bib].
— _end note_]
[Example 1:
static_assert(comp-name("UTF-8", "utf8") == true);
static_assert(comp-name("u.t.f-008", "utf8") == true);
static_assert(comp-name("ut8", "utf8") == false);
static_assert(comp-name("utf-80", "utf8") == false);
— _end example_]
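The comparison rules can be sketched as a single left-to-right pass over each string. This is one possible reading of the rules, not the exposition-only function itself: the namespace sketch, the cursor type, and comp_name are hypothetical names, and a zero is treated as significant whenever the previous significant character was a digit, which follows the Charset Alias Matching formulation.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch {

// Walks a name and yields only the characters that take part in the comparison.
struct cursor {
    std::string_view s;
    std::size_t pos = 0;
    bool in_number = false;  // previous significant character was a digit

    // Returns the next significant character, lower-cased, or '\0' at the end.
    constexpr char next() noexcept {
        // Lookup tables avoid assuming that letters are contiguous in the
        // ordinary literal encoding; digits are guaranteed to be contiguous.
        constexpr std::string_view lower = "abcdefghijklmnopqrstuvwxyz";
        constexpr std::string_view upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        while (pos < s.size()) {
            const char c = s[pos++];
            if (c >= '1' && c <= '9') {
                in_number = true;
                return c;
            }
            if (c == '0') {
                if (in_number) return c;  // zero inside a number is kept
                continue;                 // leading zero is ignored
            }
            if (lower.find(c) != std::string_view::npos) {
                in_number = false;
                return c;
            }
            if (const auto i = upper.find(c); i != std::string_view::npos) {
                in_number = false;
                return lower[i];          // character case is ignored
            }
            // Neither a digit nor a letter: ignored; in_number is unchanged.
        }
        return '\0';
    }
};

// Hypothetical stand-in for the exposition-only comp-name.
constexpr bool comp_name(std::string_view a, std::string_view b) noexcept {
    cursor ca{a};
    cursor cb{b};
    for (;;) {
        const char x = ca.next();
        const char y = cb.next();
        if (x != y) return false;
        if (x == '\0') return true;  // both strings are exhausted
    }
}

static_assert(comp_name("UTF-8", "utf8"));
static_assert(comp_name("u.t.f-008", "utf8"));
static_assert(!comp_name("ut8", "utf8"));
static_assert(!comp_name("utf-80", "utf8"));

}  // namespace sketch
```

The sketch compares the two names lazily rather than building normalized copies, which keeps it usable in constant expressions without allocation.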