[Python-Dev] transform() and untransform() methods, and the codec registry
Alexander Belopolsky alexander.belopolsky at gmail.com
Tue Dec 7 06:57:43 CET 2010
- Previous message: [Python-Dev] transform() and untransform() methods, and the codec registry
- Next message: [Python-Dev] [Python-checkins] r86965 - python/branches/py3k/Lib/test/__main__.py
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Tue, Dec 7, 2010 at 12:06 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Tue, Dec 7, 2010 at 2:46 PM, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:
>> Having all encodings accessible in a str method only promotes a programming style where bytes objects can contain differently encoded strings in different parts of the program. Instead, well-written programs should decode bytes on input, do all processing with the str type, and encode on output. When strings need to be passed to char* C APIs, they should be encoded in UTF-8. Many C APIs originally designed for ASCII actually produce meaningful results when given UTF-8 bytes. (Supporting such usage was one of the design goals of UTF-8.)
>
> This world sounds nice, but it isn't the one that exists right now. Practicality beats purity and all that :)
... and the default encoding being fixed as UTF-8 already goes 99% of the way to that world. As long as I can use encode/decode without an argument, it does not bother me much that they can take one. These methods are also much easier to ignore than the transform/untransform pair, simply because there is only one method per class.

transform/untransform have a much larger mental footprint, not only because there are two of them in both str and bytes, but also because both str and bytes have a synonymously named translate method. With 43 non-special methods, the str interface is already huge. The transform() method with a suitable set of codecs could possibly replace things like expandtabs() or swapcase(), but that would be like writing x.transform('exp') and x.untransform('exp') instead of math.exp(x) and math.log(x).
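A minimal sketch of the decode-on-input, encode-on-output style discussed above, relying on encode() and decode() defaulting to UTF-8 in Python 3. The function name `shout` and the sample data are illustrative assumptions, not something from the thread.

```python
def shout(raw: bytes) -> bytes:
    """Decode at the boundary, work on str, encode on the way out."""
    text = raw.decode()        # no argument: UTF-8 is the default in Python 3
    processed = text.upper()   # all processing happens on the str type
    return processed.encode()  # no argument: UTF-8 again


data = "naïve café".encode()   # stand-in for bytes read from a file or socket
print(shout(data).decode())
```

Here bytes exist only at the edges of the function, and no encoding name appears anywhere, so the optional argument to encode/decode can indeed be ignored.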