[Python-Dev] PEP 467: last round (?)

Ethan Furman ethan at stoneleaf.us
Mon Sep 5 12:58:42 EDT 2016


On 09/03/2016 09:48 AM, Nick Coghlan wrote:

On 3 September 2016 at 21:35, Martin Panter wrote:

On 3 September 2016 at 08:47, Victor Stinner wrote:

Le samedi 3 septembre 2016, Random832 a écrit :

On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote:

The problem with only having bchr is that it doesn't help with bytearray;

What is the use case for bytearray.fromord? Even in the rare case someone needs it, why not bytearray(bchr(...))?

Yes, this was my point: I don't think that we need a bytearray method to create a mutable string from a single byte.

I agree with the above. Having an easy way to turn an int into a bytes object is good. But I think the built-in bchr() function on its own is enough. Just like we have bytes object literals, but the closest we have for a bytearray literal is bytearray(b"...").

This is a good point - earlier versions of the PEP didn't include bchr(), they just had the class methods, so "bytearray(bchr(...))" wasn't an available spelling (if I remember the original API design correctly, it would have been something like "bytearray(bytes.byte(...))"), which meant there was a strong consistency argument in having the alternate constructor on both types. Now that the PEP proposes the "bchr" builtin, the "fromord" constructors look less necessary.

tl;dr -- Sounds good to me. I'll update the PEP.


When this started, the idea behind the methods that eventually came to be called "fromord" and "fromsize" was that they would spell out the two possible interpretations of "bytes(x)":

the legacy Python 2 behavior:

 >>> var = bytes('abc')
 >>> bytes(var[1])
 'b'

the current Python 3 behavior:

 >>> var = b'abc'
 >>> bytes(var[1])
 b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
   \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
   \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
   \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
   \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
   \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
   \x00\x00'

Digging deeper, the problem turns out to be that indexing a bytes object changed:

Python 2:

 >>> b'abc'[1]
 'b'

Python 3:

 >>> b'abc'[1]
 98

If we pass an actual byte into the Python 3 bytes constructor it behaves as one would expect:

 >>> bytes(b'b')
 b'b'

Given all this, it can be argued that the real problem is that a bytes object behaves differently depending on whether you retrieve a single byte with an index or with a slice:

 >>> b'abc'[2]
 99

 >>> b'abc'[2:]
 b'c'

Since we cannot fix that behavior, the question is how do we make it more livable?
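
For reference, here is a small sketch of the spellings that already work in Python 3 today for turning an indexed position back into a length-1 bytes object. The variable names are only for illustration; nothing here is part of the proposal.

```python
data = b'abc'

by_index = data[2]            # indexing gives an int in Python 3
by_slice = data[2:3]          # slicing gives a length-1 bytes object
from_int = bytes([by_index])  # wrapping the int in a list gives one byte back
zero_fill = bytes(by_index)   # a bare int means "this many zero bytes"

print(by_index, by_slice, from_int, len(zero_fill))
```

The last line is the surprising one: bytes(99) is 99 NUL bytes, not b'c'.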

Which is all to say we have two problems to deal with: getting bytes from an int, and getting bytes from a bytes object.

Since "bytes.fromint()" and "bchr()" are the same, and given that "bchr(ordinal)" mirrors "chr(ordinal)", I think "bchr" is the better choice for getting bytes from an int.
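
A rough pure-Python sketch of what "bchr" could look like, assuming it simply mirrors chr() for the range 0-255. This is a stand-in for illustration, not the actual implementation, and the range check relies on what the bytes constructor already does for lists of ints.

```python
def bchr(ordinal):
    """Hypothetical stand-in for the proposed builtin: int -> length-1 bytes."""
    # bytes() raises ValueError for values outside range(256),
    # so no extra range check is needed in this sketch.
    return bytes([ordinal])


single = bchr(98)              # b'b'
mutable = bytearray(bchr(98))  # the spelling suggested for bytearray

print(single, mutable)
```

With this, the bytearray case needs no separate constructor: wrapping the result in bytearray() is enough.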

For getting bytes from bytes, "getbyte()" and "iterbytes" are good choices.
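
To make the "bytes from bytes" side concrete, here is a sketch of the two helpers written as plain functions. The names come from the discussion, but the signatures and exact behavior shown here are assumptions for illustration only.

```python
def getbyte(data, index):
    """Hypothetical helper: like data[index], but returns length-1 bytes."""
    # Plain indexing keeps the usual IndexError and negative-index behavior.
    return bytes([data[index]])


def iterbytes(data):
    """Hypothetical helper: iterate over data as length-1 bytes objects."""
    for i in range(len(data)):
        yield data[i:i + 1]


print(getbyte(b'abc', 1))
print(list(iterbytes(b'abc')))
```

Compare list(b'abc'), which gives [97, 98, 99], with list(iterbytes(b'abc')), which gives [b'a', b'b', b'c'].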

Given that, and the uncertain deprecation time frame for accepting integers in the main bytes and bytearray constructors, perhaps both the "fromsize" and "fromord" parts of the proposal can be deferred indefinitely in favour of just adding the bchr() builtin?

Agreed.

-- Ethan


