[Python-Dev] len(chr(i)) = 2?

Stephen J. Turnbull stephen at xemacs.org
Mon Nov 22 11:47:09 CET 2010


"Martin v. Löwis" writes:

> More interestingly (and to the subject) is chr: how did you arrive at C9 banning Python3's definition of chr? This chr function puts the code sequence into well-formed UTF-16; that's the whole point of UTF-16.

No, it doesn't, in the specific case of surrogate code points. In 3.1.2 from MacPorts on an iBook G4 and from Gentoo on AMD64, chr(0xd800) returns "\ud800".

I don't know if that's by design (e.g., so that it can be used in the implementation of the surrogateescape error handler) or a correctable oversight, but it's not conformant.
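
For anyone who wants to check this locally, here is a minimal sketch. It assumes a Python 3 interpreter; the helper name can_encode is made up for illustration, and the strict-encoding result reflects recent CPython releases, so older versions may differ. It shows that chr() hands back a lone surrogate, that a strict codec refuses to encode it, and how the surrogateescape handler relies on such code points.

```python
# Sketch only: behavior observed on recent CPython 3; older releases may differ.

# chr() accepts a surrogate code point and returns a one-character string
# holding a lone surrogate, which is not a Unicode scalar value.
s = chr(0xD800)
is_lone_surrogate = len(s) == 1 and 0xD800 <= ord(s) <= 0xDFFF


def can_encode(text, encoding="utf-8"):
    """Hypothetical helper: True if text encodes with the strict handler."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The strict UTF-8 codec rejects the lone surrogate.
encodable = can_encode(s)

# surrogateescape maps an undecodable byte 0x80 to the lone surrogate U+DC80
# and maps it back to the same byte on encoding.
escaped = b"\x80".decode("utf-8", "surrogateescape")
round_trip = escaped.encode("utf-8", "surrogateescape")

print(repr(s), is_lone_surrogate, encodable, repr(escaped), round_trip)
```

The surrogateescape round trip is one reason a str may need to hold lone surrogates at all, whatever one concludes about conformance.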


