[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

"Martin v. Löwis" martin at v.loewis.de
Tue Feb 14 07:52:13 CET 2006


Phillip J. Eby wrote:

I was just pointing out that since byte strings are bytes by definition, then simply putting those bytes in a bytes() object doesn't alter the existing encoding. So, using latin-1 when converting a string to bytes actually seems like the One Obvious Way to do it.

This is a misconception. In Python 2.x, the type str already is a bytes type. So if S is an instance of 2.x str, bytes(S) does not need to do any conversion. You don't need to assume it is latin-1: it's already bytes.

In fact, the 'encoding' argument seems useless in the case of str objects, and it seems it should default to latin-1 for unicode objects.

I agree with the former, but not with the latter. There shouldn't be a conversion of Unicode objects to bytes at all. If you want bytes from a Unicode string U, write

bytes(U.encode(encoding))
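
A minimal sketch of that idiom, written against present-day Python 3 (where str is Unicode text and bytes is a separate type) rather than the 2.5-era proposal under discussion. The sample strings and variable names are illustrative assumptions, not part of any proposal:

```python
# Assumption: Python 3 semantics, where str holds Unicode text.
text = "L\u00f6wis"  # "Löwis"

# The caller names the encoding explicitly; nothing is guessed.
utf8_bytes = bytes(text.encode("utf-8"))
latin1_bytes = bytes(text.encode("latin-1"))

# Data that is already bytes passes through bytes() unchanged.
raw = b"\xff\x00"
same = bytes(raw)

# latin-1 cannot represent every Unicode character, so a silent
# latin-1 default would fail on text such as the euro sign.
try:
    "\u20ac".encode("latin-1")
    latin1_failed = False
except UnicodeEncodeError:
    latin1_failed = True

print(utf8_bytes)     # b'L\xc3\xb6wis'
print(latin1_bytes)   # b'L\xf6wis'
print(same == raw)    # True
print(latin1_failed)  # True
```

The same text yields different byte sequences under different encodings, which is why it seems safer to have the caller choose one than to build in a default.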

Regards, Martin


