[Python-Dev] PEP 393 Summer of Code Project

"Martin v. Löwis" martin at v.loewis.de
Wed Aug 24 19:54:06 CEST 2011


> Eg, display of characters in the interpreter. I don't know why you say it's "done in terms of UTF-16", then. Unicode strings are simply encoded to whatever character set is detected as the terminal's character set.

I think what he means (and what I meant when I said something similar): I/O will consider surrogate pairs in the representation when converting to the output encoding. This is actually relevant only for UTF-8 (I think), which converts surrogate pairs "correctly". This can be taken as proof that Python 3.2 is "UTF-16 aware" (in some places, but not in others).

With Python's I/O architecture, it is of course not actually the I/O which considers UTF-16, but the codec.
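
As a rough illustration of what converting surrogate pairs "correctly" means at the codec level, here is a minimal sketch. It is not the actual codec source: the helper name combine_surrogates and the sample character are assumptions chosen for the example. Since CPython 3.3 and later no longer store surrogate pairs inside a str, the sketch obtains the two UTF-16 code units explicitly instead of relying on a 3.2 narrow build.

```python
# Sketch only: mimics how a UTF-8 encoder can treat a UTF-16 surrogate pair
# as one code point. combine_surrogates is a hypothetical helper.

ch = "\U0001F600"  # a sample non-BMP code point, U+1F600

# In UTF-16 this character is two 16-bit code units (a surrogate pair).
utf16 = ch.encode("utf-16-be")
high = int.from_bytes(utf16[:2], "big")  # high (lead) surrogate
low = int.from_bytes(utf16[2:], "big")   # low (trail) surrogate


def combine_surrogates(high, low):
    """Recombine a high/low surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = combine_surrogates(high, low)

# "Correct" conversion: the pair becomes one 4-byte UTF-8 sequence.
correct_utf8 = chr(code_point).encode("utf-8")

# Naive conversion: each surrogate encoded on its own, 3 bytes apiece.
# The surrogatepass error handler is needed because lone surrogates
# are otherwise rejected by the UTF-8 codec.
naive_utf8 = (chr(high) + chr(low)).encode("utf-8", "surrogatepass")

print(hex(high), hex(low), hex(code_point))
print(correct_utf8, len(naive_utf8))
```

The difference between the 4-byte and the 6-byte result is the sense in which the codec, rather than the I/O layer, has to be aware of UTF-16.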

Regards, Martin


