<regex>: regex_traits::transform_primary should yield primary sort keys appropriate for the imbued locale by muellerj2 · Pull Request #5444 · microsoft/STL

Fixes #5435. Fixes #5291.

The actual work is done in two new functions, __std_regex_transform_primary_char and __std_regex_transform_primary_wchar_t, which are essentially 1:1 copies of _Strxfrm() and _Wcsxfrm() that pass different flags to __crtLCMapStringA/W. I also took the liberty of correcting the SAL annotations.

__crtLCMapStringA/W are declared in awint.hpp which includes yvals.h. I'm uncertain if this is the best approach, but I undefined _ENFORCE_ONLY_CORE_HEADERS so that awint.hpp can be included.

transform_primary has to check the types of the collate facets using RTTI, so I made the function always return an empty string when dynamic RTTI is disabled (i.e., _CPPRTTI is undefined). The implementation itself is heavily based on collate::do_transform (including the change in #5431). It also needs access to the internals of collate, so I made _Regex_traits a friend of collate.
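To illustrate the kind of RTTI check described above, here is a minimal sketch and not the code from this PR. The helper has_library_collate and the facet custom_collate are hypothetical names introduced only for this example, and the exact comparison performed in the PR may differ. The sketch assumes the point of the check is to tell the library's own collate facets apart from user-derived ones, whose internals the library cannot rely on.

```cpp
#include <locale>
#include <string>
#include <typeinfo>

// Hypothetical user-defined facet, used only to show a type that is
// derived from std::collate<char> but is not a library facet.
struct custom_collate : std::collate<char> {};

// Hypothetical helper: returns true if the collate<char> facet installed in
// `loc` has exactly the dynamic type std::collate<char> or
// std::collate_byname<char>. Without RTTI the dynamic type cannot be
// inspected, so the helper conservatively returns false.
bool has_library_collate(const std::locale& loc) {
#if defined(_CPPRTTI) || defined(__cpp_rtti)
    const std::collate<char>& facet = std::use_facet<std::collate<char>>(loc);
    return typeid(facet) == typeid(std::collate<char>)
        || typeid(facet) == typeid(std::collate_byname<char>);
#else
    (void) loc;
    return false;
#endif
}
```

For the classic locale this helper reports a library facet, while a locale constructed as std::locale(std::locale::classic(), new custom_collate) does not pass the check. The fallback branch mirrors the choice made in this PR of giving up (there: returning an empty string) when RTTI is unavailable.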

There is a behavior change for the C locale. As I explained in more detail in #5435, the traits requirement in [re.req]/20 is misleading because it is wrong for precisely one locale: the C locale (or the POSIX locale; see the collation order definition here: https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap07.html#tag_07_03_02_06). Since equivalence classes are derived from POSIX, and the definition of regex_traits::transform_primary alludes to "primary sort keys", which indirectly references terminology from the POSIX standard (https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap07.html#tag_07_03_02), I think we should do as POSIX says: "A" should not match [[=a=]].

This has consequences:

Since matching and parsing of equivalences no longer go through collate::transform, related tests no longer have to be skipped under IDL mismatch.
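The C-locale behavior change described above can be observed with a small probe like the following. This is only a sketch: matches_equiv_a and same_primary_key are hypothetical helper names introduced for illustration, and the result for "A" is deliberately not stated in the code because it depends on the standard library implementation and version in use.

```cpp
#include <locale>
#include <regex>
#include <string>

// Hypothetical helper: returns true if `text` as a whole matches the
// equivalence class [[=a=]] when the regex is compiled for locale `loc`.
bool matches_equiv_a(const std::string& text, const std::locale& loc) {
    std::regex re;
    re.imbue(loc);         // imbue first; the pattern is compiled afterwards
    re.assign("[[=a=]]");  // equivalence class of the character 'a'
    return std::regex_match(text, re);
}

// Hypothetical helper: compares the keys returned by
// regex_traits::transform_primary for two strings under locale `loc`.
bool same_primary_key(const std::string& lhs, const std::string& rhs,
                      const std::locale& loc) {
    std::regex_traits<char> traits;
    traits.imbue(loc);
    return traits.transform_primary(lhs.begin(), lhs.end())
        == traits.transform_primary(rhs.begin(), rhs.end());
}
```

With the classic locale, matches_equiv_a("a", std::locale::classic()) is expected to hold and matches_equiv_a("b", std::locale::classic()) is not. Under the POSIX reading argued for in this PR, matches_equiv_a("A", std::locale::classic()) should be false as well; other standard library implementations may answer differently.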