[Python-Dev] PEP 393 Summer of Code Project
"Martin v. Löwis" martin at v.loewis.de
Thu Aug 25 09:50:08 CEST 2011
> What about things like the surrogateescape codec that deliberately use code units in non-standard ways? Will tricks like that still be possible if the code-unit level is hidden from the programmer?
Most certainly. In the PEP 393 representation, the surrogate characters can readily be represented (and would imply at least the two-byte form), but they will never take their UTF-16 function (i.e. the UTF-8 codec won't try to combine surrogate pairs), so they can be used for surrogateescape and other functions. Of course, in strict error mode, codecs will refuse to encode them (notice that surrogateescape is an error handler, not a codec).
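A minimal sketch of that behaviour, assuming a Python 3 interpreter. The sample byte string and the helper name strict_encode_fails are made up for illustration; only the "surrogateescape" error handler and the built-in UTF-8 codec come from Python itself.

```python
# Bytes that are not valid UTF-8 (0xFF never occurs in well-formed UTF-8).
raw = b"caf\xff"

# surrogateescape is passed as an error handler, not as a codec name.
# On decoding, each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")

# The same handler on encoding turns the lone surrogate back into the byte.
roundtrip = text.encode("utf-8", errors="surrogateescape")


def strict_encode_fails(s):
    """Return True if the UTF-8 codec in strict mode refuses to encode s.

    Hypothetical helper, used only to make the checks below readable.
    """
    try:
        s.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


# Two surrogate code points written next to each other stay two separate
# characters; the UTF-8 codec does not combine them into U+1F600.
pair = "\ud83d\ude00"

print(ascii(text))                 # the string holds a lone surrogate
print(roundtrip == raw)            # the original bytes come back
print(strict_encode_fails(text))   # strict mode rejects the surrogate
print(len(pair), strict_encode_fails(pair))
```

The round trip only works because the surrogate is stored as an ordinary code point and is given no UTF-16 meaning; in strict mode the same string is rejected.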
Regards, Martin