[Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

Ian Bicking ianb at colorstudy.com
Sat Feb 18 00:13:51 CET 2006


Martin v. Löwis wrote:

Users do

py> "Martin v. Löwis".encode("utf-8")
Traceback (most recent call last):
  File "", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: ordinal not in range(128)

because they want to convert the string "to Unicode", and they have found a text telling them that .encode("utf-8") is a reasonable method.

What it should tell them is

py> "Martin v. Löwis".encode("utf-8")
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'str' object has no attribute 'encode'

I think it would be even better if they got "ValueError: utf8 can only encode unicode objects". AttributeError is not much more clear than the UnicodeDecodeError.
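
To make the suggested error concrete, here is a minimal sketch of what such a check might look like. The helper name strict_encode is hypothetical and is not part of any existing or proposed API; the sketch also assumes a modern Python 3 interpreter, where str is the unicode type and bytes is the byte string type.

```python
def strict_encode(text, encoding="utf-8"):
    """Hypothetical helper: encode text, refusing anything that is not unicode."""
    # Assumption: Python 3, where str is the unicode type.
    if not isinstance(text, str):
        # Say what went wrong in terms of the type, instead of attempting an
        # implicit ASCII decode or failing with an AttributeError.
        raise ValueError(
            "%s can only encode unicode objects, got %s"
            % (encoding, type(text).__name__)
        )
    return text.encode(encoding)


encoded = strict_encode("Martin v. Löwis")

try:
    # Passing already-encoded bytes is the mistake described above.
    strict_encode(encoded)
except ValueError as exc:
    message = str(exc)
```

The point of the sketch is only that the error names the actual problem (wrong input type) rather than a side effect of it.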

That str.encode(unicode_encoding) implicitly decodes strings seems like a flaw in the unicode encodings, quite separate from the existence of str.encode. I for one really like s.encode('zlib').encode('base64') -- and if the zlib encoding raised an error when it was passed a unicode object (instead of implicitly encoding the string with the ascii encoding) that would be fine.

The pipe-like nature of .encode and .decode works very nicely for certain transformations, applicable to both unicode and byte objects. Let's not throw the baby out with the bath water.
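
The pipe-like chaining described above can be sketched roughly as follows. This assumes a modern Python 3 interpreter, where the bytes-to-bytes codecs are reached through codecs.encode and codecs.decode (under the names zlib_codec and base64_codec) rather than through a method on the string itself. The helpers pipe_encode and pipe_decode are hypothetical names used only for illustration.

```python
import codecs


def pipe_encode(data, *encodings):
    """Hypothetical helper: apply each codec in turn, left to right."""
    for name in encodings:
        data = codecs.encode(data, name)
    return data


def pipe_decode(data, *encodings):
    """Hypothetical helper: undo a chain by decoding in the given order."""
    for name in encodings:
        data = codecs.decode(data, name)
    return data


payload = b"spam and eggs " * 20

# Compress first, then make the result safe for text transports.
packed = pipe_encode(payload, "zlib_codec", "base64_codec")

# Decoding runs the same stages in reverse order.
restored = pipe_decode(packed, "base64_codec", "zlib_codec")

# The zlib codec refuses a unicode string instead of implicitly encoding it.
try:
    codecs.encode("not bytes", "zlib_codec")
    rejected = False
except TypeError:
    rejected = True
```

Here the strict behaviour asked for above is what the codec does: a unicode string passed to the zlib codec is rejected with an error rather than being silently encoded as ASCII first.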

-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org


