[Python-Dev] transform() and untransform() methods, and the codec registry

Nick Coghlan ncoghlan at gmail.com
Tue Dec 7 06:06:13 CET 2010


On Tue, Dec 7, 2010 at 2:46 PM, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

Having all encodings accessible in a str method only promotes a programming style where bytes objects can contain differently encoded strings in different parts of the program. Instead, well-written programs should decode bytes on input, do all processing with the str type, and encode on output. When strings need to be passed to char* C APIs, they should be encoded in UTF-8. Many C APIs originally designed for ASCII actually produce meaningful results when given UTF-8 bytes. (Supporting such usage was one of the design goals of UTF-8.)
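
For illustration only, the style described in the quoted paragraph might look like the minimal sketch below. The function name `shout`, the Latin-1 input encoding, and the upper-casing step are assumptions chosen for the example; none of them come from the thread.

```python
def shout(raw: bytes, input_encoding: str = "latin-1") -> bytes:
    """Hypothetical helper: decode at the input boundary, work on str, encode on output."""
    # Decode exactly once, as soon as the bytes enter the program.
    text = raw.decode(input_encoding)
    # All processing operates on str, independent of any byte encoding.
    processed = text.upper()
    # Encode exactly once, when the result leaves the program (UTF-8 here).
    return processed.encode("utf-8")


print(shout(b"caf\xe9"))
```

In this sketch the encoding is named only at the two boundaries, so no bytes object with an unknown encoding travels through the rest of the program.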

This world sounds nice, but it isn't the one that exists right now. Practicality beats purity and all that :)

Cheers, Nick.

-- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


