[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces (original) (raw)

"Martin v. Löwis" martin at v.loewis.de
Sat Apr 25 14:44:49 CEST 2009


If the bytes are mapped to single half surrogate codes instead of the normal pairs (low+high), then I can see that decoding could never be ambiguous and encoding could produce the original bytes.

I was confused by Markus Kuhn's original UTF-8b specification. I have now changed the PEP to avoid using PUA characters at all.

Regards, Martin



More information about the Python-Dev mailing list