Message 117046 - Python tracker
I use Python 3, where len("\U00010337") == 2 on a narrow build.
Yes, wide Unicode on a narrow build is a problem:
>>> regex.findall("\U00010337", "a\U00010337bc")
[]
>>> regex.findall("(?i)\U00010337", "a\U00010337bc")
[]
I'm not sure how (or whether!) to handle surrogate pairs. It would make things more complicated.
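For illustration only, here is a rough sketch of what a narrow build is doing with that character. It stores any code point above U+FFFF as two UTF-16 code units (a surrogate pair), which is why len() reports 2. The helper names `to_surrogate_pair` and `utf16_length` are made up for this example and are not part of the regex module; the sketch assumes Python 3.3 or later, where there are no narrow builds, so it works the code units out by hand.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair.

    Hypothetical helper for illustration; not part of the regex module.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits go in the high surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits go in the low surrogate
    return high, low


def utf16_length(text):
    """Count UTF-16 code units, i.e. what len() would report on a narrow build."""
    return len(text.encode("utf-16-le")) // 2


high, low = to_surrogate_pair(0x10337)
print(hex(high), hex(low))            # 0xd800 0xdf37
print(utf16_length("\U00010337"))     # 2
print(utf16_length("a\U00010337bc"))  # 5
```

So a matcher working on a narrow build sees "a\U00010337bc" as five code units rather than four characters, and would have to treat each high/low pair as a single unit to handle it properly.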
I suppose the moral is that if you want to use wide Unicode then you really should use a wide build.