Message 81048 - Python tracker

FWIW, on Python3 it seems to work:

>>> import unicodedata
>>> unicodedata.category("\U00010000")
'Lo'
>>> unicodedata.category("\U00011000")
'Cn'
>>> unicodedata.category(chr(0x10000))
'Lo'
>>> unicodedata.category(chr(0x11000))
'Cn'
>>> ord(chr(0x10000)), 0x10000
(65536, 65536)
>>> ord(chr(0x11000)), 0x11000
(69632, 69632)

I'm using a narrow build too:

>>> import sys
>>> sys.maxunicode
65535
>>> len('\U00010000')
2
>>> ord('\U00010000')
65536

On Python2, unichr() is supposed to raise a ValueError on a narrow build if the value is greater than 0xFFFF [1], but since characters above 0xFFFF can be represented with u"\Uxxxxxxxx", there should be a way to fix unichr() so it can return them too. Python3 already does this with chr().
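For reference, a narrow build stores a character above 0xFFFF as a UTF-16 surrogate pair, which is why len('\U00010000') is 2 above. A fixed unichr() could presumably build the same pair from the code point. The following is only a sketch of that arithmetic, not the actual CPython code; the helper names to_surrogate_pair and from_surrogate_pair are made up for illustration.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into (high, low) UTF-16 surrogates.

    Hypothetical helper: it only illustrates the arithmetic a narrow
    build would need, it is not part of Python.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point must be in range(0x10000, 0x110000)")
    offset = code_point - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Inverse of to_surrogate_pair (also a hypothetical helper)."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


pair = to_surrogate_pair(0x10000)
print([hex(unit) for unit in pair])
print(from_surrogate_pair(*pair) == 0x10000)
```

On a narrow build, the two code units returned for 0x10000 would be the two elements of the length-2 string shown earlier.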

Maybe we should open a new issue for this if it's not present already.