[Python-Dev] Add transform() and untranform() methods

Nick Coghlan ncoghlan at gmail.com
Sat Nov 16 01:26:02 CET 2013


On 16 Nov 2013 02:36, "Antoine Pitrou" <solipsis at pitrou.net> wrote:

On Sat, 16 Nov 2013 00:46:15 +1000
Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 16 November 2013 00:04, Antoine Pitrou <solipsis at pitrou.net> wrote:
> >> Rather than the more useful:
> >>
> >>     >>> b"abcdef".decode("hex")
> >>     Traceback (most recent call last):
> >>       File "<stdin>", line 1, in <module>
> >>     TypeError: 'hex' decoder returned 'bytes' instead of 'str'; use
> >>     codecs.decode() to decode to arbitrary types
> >
> > I think this may be confusing. TypeError seems to suggest that the
> > parameter type sent by the user to the method is wrong, which is not
> > the actual cause of the error.
>
> The TypeError isn't new,

Really? That's not what your message said.

The second example in my post included restoring the "hex" alias for "hex_codec" (its absence is the reason for the current "unknown encoding" error). The 3.2 and 3.3 error message for a restored alias would have been "TypeError: 'hex' decoder returned 'bytes' instead of 'str'", which I agree is confusing and uninformative - that's why I added the reference to the module level functions to the output type errors before proposing the restoration of the aliases.

So you can already use "codecs.decode(s, 'hex_codec')" in Python 3; you just won't get a useful error leading you there if you use the more common 'hex' alias instead.
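For illustration, here is a minimal sketch of the difference between the module level functions and the bytes method, using the full "hex_codec" name. The exact exception type and message raised by the method call vary between Python 3 releases, so the sketch catches both plausible types rather than assuming one.

```python
import codecs

# The module level functions accept bytes-to-bytes codecs such as hex_codec.
decoded = codecs.decode(b"abcdef", "hex_codec")
encoded = codecs.encode(decoded, "hex_codec")
print(decoded, encoded)

# bytes.decode() is restricted to text encodings, so the same codec is
# rejected here. Depending on the Python 3 version this surfaces as a
# TypeError or a LookupError, hence the broad except clause.
try:
    b"abcdef".decode("hex_codec")
except (LookupError, TypeError) as exc:
    print(type(exc).__name__, exc)
```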

To address Serhiy's security concerns with the compression codecs (which are technically independent of the question of restoring the aliases), I also plan to document how to systematically blacklist particular codecs in an application by setting attributes on the encodings module and/or appropriate entries in sys.modules.
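As a rough sketch of the sys.modules variant of that idea (not the final documented recipe): putting a None entry in sys.modules makes the import of the codec's module fail, and the encodings search function then reports the codec as unknown. The helper name block_codec is hypothetical, and the sketch assumes it runs before the codec is first looked up, since successful lookups are cached.

```python
import codecs
import sys


def block_codec(module_name):
    """Make ``encodings.<module_name>`` unimportable (hypothetical helper)."""
    # A None entry in sys.modules causes the import system to raise
    # ImportError, which the encodings search function treats as
    # "no such codec".
    sys.modules["encodings." + module_name] = None


# Must happen before any lookup of the codec, because lookups are cached.
block_codec("bz2_codec")

try:
    codecs.lookup("bz2_codec")
    blocked = False
except LookupError:
    blocked = True

print("bz2_codec blocked:", blocked)
```

An application would call such a helper once at startup for each codec it wants to disallow.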

Finally, I now plan to write a documentation PEP that suggests clearly splitting the codecs module docs into two layers: the type-agnostic core infrastructure and the specific application of that infrastructure to the implementation of the text encoding model.

The only functional change I'd still like to make for 3.4 is to restore the shorthand aliases for the non-Unicode codecs (to ease the migration for folks coming from Python 2), but this thread has convinced me I likely need to write the PEP before doing that, and I still have to integrate ensurepip into pyvenv before the beta 1 deadline.

So unless you and Victor are prepared to +1 the restoration of the codec aliases (closing issue 7475) in anticipation of that codecs infrastructure documentation PEP, the change to restore the aliases probably won't be in 3.4. (I might get the PEP written in time regardless, but I'm not betting on it at this point).

Cheers, Nick.

Regards Antoine.

