Issue 36486: Bugs and inconsistencies in unicodedata
In `unicodedata`, the functions `lookup` and `name` have some bugs and inconsistencies.
`lookup` matches names case-insensitively, except for the algorithmically derived names of Hangul syllables and CJK unified ideographs, which must be in all caps. The documentation does not explain how character names are fuzzily matched.
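The difference can be observed by probing `lookup` with upper-case and lower-case spellings of the same name. This is only an illustrative sketch: `try_lookup` is a hypothetical helper, not part of `unicodedata`, and the results for the lower-case algorithmic names depend on the Python version in use.

```python
import unicodedata

def try_lookup(name):
    """Hypothetical helper: return the character for name, or None on KeyError."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None

probes = [
    "LATIN SMALL LETTER A",        # ordinary table-based name
    "latin small letter a",        # same name in lower case
    "CJK UNIFIED IDEOGRAPH-4E00",  # algorithmic name, all caps
    "cjk unified ideograph-4e00",  # algorithmic name, lower case
    "HANGUL SYLLABLE GA",          # algorithmic name, all caps
    "hangul syllable ga",          # algorithmic name, lower case
]

for probe in probes:
    # !a prints an ASCII-safe repr, so the output does not depend on the terminal encoding
    print(f"{probe!r}: {try_lookup(probe)!a}")
```

A `None` result for a lower-case algorithmic name, next to a successful result for the lower-case table-based name, is the inconsistency described above.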
`lookup` accepts names like “CJK UNIFIED IDEOGRAPH-04E00”, where the code point has an extra leading zero, even though the only correct name for that character is “CJK UNIFIED IDEOGRAPH-4E00”.
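As a rough sketch of what a stricter check on the hexadecimal suffix could look like, the code below round-trips the suffix through `int` and back. Both `is_canonical_hex` and `accepts` are hypothetical helpers introduced for illustration; they are not part of `unicodedata`, and whether `lookup` accepts the padded name depends on the Python version.

```python
import unicodedata

def is_canonical_hex(suffix):
    """Hypothetical check: upper-case hex, at least four digits, no extra leading zeros."""
    try:
        value = int(suffix, 16)
    except ValueError:
        return False
    # Re-format the value with the usual minimum of four digits and compare.
    return suffix == f"{value:04X}"

def accepts(name):
    """Hypothetical helper: True if unicodedata.lookup resolves the name."""
    try:
        unicodedata.lookup(name)
    except KeyError:
        return False
    return True

prefix = "CJK UNIFIED IDEOGRAPH-"
for suffix in ("4E00", "04E00"):
    print(suffix, "canonical:", is_canonical_hex(suffix),
          "accepted by lookup:", accepts(prefix + suffix))
```

A suffix that is not canonical but is still accepted by `lookup` corresponds to the behaviour reported here.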
`lookup` and `name` don’t implement rule NR2, defined in chapter 4 of the Unicode Standard, for the names of Tangut ideographs.
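Rule NR2 derives an ideograph's name by appending the code point in hexadecimal (at least four digits) to a script-specific prefix. The sketch below builds the expected name for the first Tangut ideograph, U+17000, and compares it with what `name` reports. `nr2_name` and `name_or_none` are hypothetical helpers, and the actual result depends on the Python version and the Unicode database it ships with.

```python
import unicodedata

def nr2_name(prefix, code_point):
    """Hypothetical helper: build an NR2-style name from a prefix and a code point."""
    return f"{prefix}{code_point:04X}"

def name_or_none(ch):
    """Return the name unicodedata reports for ch, or None if it has none."""
    return unicodedata.name(ch, None)

expected = nr2_name("TANGUT IDEOGRAPH-", 0x17000)
actual = name_or_none(chr(0x17000))

print("expected:", expected)
print("actual:  ", actual)
```

If `actual` is `None` while the character is assigned in the bundled Unicode version, `name` is not applying NR2 to Tangut ideographs.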