[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

Guido van Rossum guido at python.org
Tue Feb 14 05:07:49 CET 2006


On 2/13/06, Neil Schemenauer <nas at arctrix.com> wrote:

> Guido van Rossum <guido at python.org> wrote:
> >> In py3k, when the str object is eliminated, then what do you have?
> >> Perhaps
> >> - bytes("\x80"), you get an error, encoding is required. There is no
> >> such thing as "default encoding" anymore, as there's no str object.
> >> - bytes("\x80", encoding="latin-1"), you get a bytestring with a
> >> single byte of value 0x80.
> >
> > Yes to both again.
>
> I haven't been following this discussion about bytes() real closely, but I don't think that bytes() should do the encoding. We already have a way to spell that:
>
>     "\x80".encode('latin-1')

But in 2.5 we can't change that to return a bytes object without creating HUGE incompatibilities.

In general I've come to appreciate that there are two ways of converting an object of type A to an object of type B: ask an A instance to convert itself to a B, or ask the type B to create a new instance from an A. Depending on what A and B are, both APIs make sense; sometimes reasons of decoupling require that A can't know about B, in which case you have to use the latter approach; sometimes B can't know about A, in which case you have to use the former. Even when A == B we sometimes support both APIs: to create a new list from a list a, you can write a[:] or list(a); to create a new dict from a dict d, you can write d.copy() or dict(d).
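The two directions described above can be sketched in a few lines. The classes `A` and `B` and the names `to_b` and `from_a` are hypothetical, chosen only to mirror the wording of the paragraph; the built-in cases at the end are the ones named in the text.

```python
# Hypothetical types, named after the A and B in the paragraph above.
class B:
    def __init__(self, value):
        self.value = value

    @classmethod
    def from_a(cls, a):
        # "Ask the type B to create a new instance from an A":
        # B reads what it needs from the A instance.
        return cls(a.payload)


class A:
    def __init__(self, payload):
        self.payload = payload

    def to_b(self):
        # "Ask an A instance to convert itself to a B":
        # A decides how it is represented as a B.
        return B(self.payload)


a = A(42)
b1 = a.to_b()      # conversion driven by the source object
b2 = B.from_a(a)   # conversion driven by the target type

# The A == B cases from the text: both spellings make a shallow copy.
items = [1, 2, 3]
copy_by_slice = items[:]
copy_by_type = list(items)

d = {"x": 1}
copy_by_method = d.copy()
copy_by_dict = dict(d)
```

In a real design only one of `to_b` or `from_a` would usually exist, depending on which of the two types is allowed to know about the other.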

An advantage of the latter API is that there's no confusion about the resulting type -- dict(d) is definitely a dict, and list(a) is definitely a list. Not so for d.copy() or a[:] -- if the input type is another mapping or sequence, it'll probably return an object of that same type.
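A small illustration of the point about result types, assuming a present-day Python 3 interpreter (`collections.OrderedDict` did not exist when this message was written, so it is used here only as a convenient example of "another mapping"):

```python
from collections import OrderedDict

t = (1, 2, 3)                  # a sequence that is not a list
od = OrderedDict(a=1, b=2)     # a mapping that is not a plain dict

# Asking the object to copy itself keeps the input's own type.
slice_type = type(t[:])        # tuple
copy_type = type(od.copy())    # OrderedDict

# Asking the target type to build the object fixes the result type.
list_type = type(list(t))      # list
dict_type = type(dict(od))     # dict
```

Not every mapping or sequence behaves this way, which is presumably why the text says "probably".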

Again, it depends on the application which is better.

I think that bytes(s, <encoding>) is fine, especially for expressing a new type, since it is unambiguous about the result type, and has no backwards compatibility issues.
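For comparison, here is how the two spellings behave in Python 3 as it was eventually released. This is a sketch of later behaviour, not of anything that existed in 2.5, and the design was still open at the time of this thread.

```python
s = "\x80"

# Constructor spelling: the result type is visible at the call site.
via_type = bytes(s, encoding="latin-1")

# Method spelling: in Python 3 this also returns a bytes object.
via_method = s.encode("latin-1")

# Without an encoding there is no default to fall back on.
try:
    bytes(s)
except TypeError as exc:
    error = exc
else:
    error = None
```

Both spellings produce the same single byte of value 0x80, and the call without an encoding fails rather than guessing.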

> Also, I think it would be useful to introduce byte array literals at the same time as the bytes object. That would allow people to use byte arrays without having to get involved with all the silly string encoding confusion.

You missed the part where I said that introducing the bytes type without a literal seems to be a good first step. A new type, even built-in, is much less drastic than a new literal (which requires lexer and parser support in addition to everything else).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


