[Python-Dev] transform() and untransform() methods, and the codec registry

Guido van Rossum guido at python.org
Thu Dec 9 19:42:27 CET 2010


On Mon, Dec 6, 2010 at 3:39 AM, M.-A. Lemburg <mal at egenix.com> wrote:

> Guido van Rossum wrote:
>> The moratorium is intended to freeze the state of the language as
>> implemented, not whatever was discussed and approved but didn't get
>> implemented (that'd be a hole big enough to drive a truck through, as
>> the saying goes :-).
>
> Sure, but those two particular methods only provide interfaces to the
> codecs sub-system without actually requiring any major implementation
> changes. Furthermore, they "help ease adoption of Python 3.x" (quoted
> from PEP 3003), since the functionality they add back was removed from
> Python 3.0 in a way that makes it difficult to port Python2
> applications to Python3.
>
>> Regardless of what I or others may have said before, I am not
>> currently a fan of adding transform() to either str or bytes.
>
> How should I read this? Do you want the methods to be removed again
> and added back in 3.3?

Given that it's in 3.2b1 I'm okay with keeping it. That's at best a +0. I'd be -0 if it wasn't already in. But anyway this should suffice to keep it in unless there are others strongly opposed.

> Frankly, I'm a bit tired of constantly having to argue against cutting
> down the Unicode and codec support in Python3.

But transform() isn't really about Unicode or codec support -- it is about string-to-string and bytes-to-bytes transformations. At least the transform() API is clear about the distinction between codecs (which translate between bytes and string) and transforms (which keep the type unchanged) -- though I still don't like that the registries for transforms and codecs use the same namespace. Also bytes-bytes and string-string transforms use the same namespace even though the typical transform only supports one or the other. E.g. IMO all of the following should raise LookupError:

>>> b'abc'.transform('rot13')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/guido/p3/Lib/encodings/rot_13.py", line 16, in encode
    return (input.translate(rot13_map), len(input))
TypeError: expected an object with the buffer interface

>>> b'abc'.decode('rot13')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/guido/p3/Lib/encodings/rot_13.py", line 19, in decode
    return (input.translate(rot13_map), len(input))
AttributeError: 'memoryview' object has no attribute 'translate'

>>> 'abc'.encode('rot13')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: encoder did not return a bytes object (type=str)

>>> b''.decode('rot13')
''

The last result may be a separate bug; b''.decode('anything') does not even seem to attempt to look up the codec.
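
To make the namespace point concrete, here is a minimal sketch of what separate registries could look like, with a lookup that fails with LookupError when a transform is not registered for the input's type. The names STR_TRANSFORMS, BYTES_TRANSFORMS and transform() are hypothetical and were never part of any Python release; only codecs.encode() and the "rot_13" and "hex_codec" codec names come from the standard library.

```python
import codecs

# Hypothetical registries: one namespace per input type, so a name
# registered for str is simply unknown when looked up for bytes.
STR_TRANSFORMS = {
    "rot13": lambda s: codecs.encode(s, "rot_13"),     # str -> str
}
BYTES_TRANSFORMS = {
    "hex": lambda b: codecs.encode(b, "hex_codec"),    # bytes -> bytes
}


def transform(data, name):
    """Apply a same-type transform, or raise LookupError if none is
    registered under `name` for the type of `data` (illustrative only)."""
    if isinstance(data, str):
        registry = STR_TRANSFORMS
    elif isinstance(data, (bytes, bytearray)):
        registry = BYTES_TRANSFORMS
    else:
        raise TypeError("transform() expects str or bytes-like input")
    try:
        func = registry[name]
    except KeyError:
        kind = type(data).__name__
        raise LookupError("unknown %s transform: %s" % (kind, name)) from None
    return func(data)
```

With this arrangement, transform('abc', 'rot13') and transform(b'abc', 'hex') succeed, while transform(b'abc', 'rot13') raises LookupError up front instead of failing inside the codec with a TypeError or AttributeError.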

-- 
--Guido van Rossum (python.org/~guido)
