Message 117046 - Python tracker

I use Python 3, where len("\U00010337") == 2 on a narrow build.
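For anyone wondering where the 2 comes from: a narrow build stores strings as UTF-16 code units, so any code point above U+FFFF is held as a surrogate pair. A rough sketch of the arithmetic follows. The helper name `to_surrogate_pair` is made up for illustration and is not part of the regex module. On current Python versions (3.3 and later) `len("\U00010337")` is 1, so the sketch counts UTF-16 code units through the codec instead.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point into UTF-16 (high, low) surrogates.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x10337)
print(hex(high), hex(low))  # 0xd800 0xdf37

# Number of UTF-16 code units, which is what len() reported on a narrow build.
utf16_units = len("\U00010337".encode("utf-16-le")) // 2
print(utf16_units)  # 2
```

So on a narrow build the subject string "a\U00010337bc" is five code units long, not four, and a matcher that compares one code unit at a time never sees U+10337 as a single character.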

Yes, wide Unicode on a narrow build is a problem:

>>> regex.findall("\U00010337", "a\U00010337bc")
[]
>>> regex.findall("(?i)\U00010337", "a\U00010337bc")
[]

I'm not sure how (or whether!) to handle surrogate pairs. It would make things more complicated.

I suppose the moral is that if you want to use wide Unicode then you really should use a wide build.