[Python-3000] PEP 3137: Immutable Bytes and Mutable Buffer
Alexandre Vassalotti alexandre at peadrop.com
Thu Sep 27 04:36:08 CEST 2007
On 9/26/07, Guido van Rossum <guido at python.org> wrote:
Constructors
------------

There are four forms of constructors, applicable to both bytes and buffer:

- bytes(<bytes>), bytes(<buffer>), buffer(<bytes>), buffer(<buffer>): simple copying constructors, with the note that bytes(<bytes>) might return its (immutable) argument.
- bytes(<str>, <encoding>[, <errors>]), buffer(<str>, <encoding>[, <errors>]): encode a text string. Note that the str.encode() method returns an immutable bytes object. The <encoding> argument is mandatory; <errors> is optional.
- bytes(<memory view>), buffer(<memory view>): construct a bytes or buffer object from anything that supports the PEP 3118 buffer API.
- bytes(<iterable of ints>), buffer(<iterable of ints>): construct an immutable bytes or mutable buffer object from a stream of integers in range(256).
- buffer(<int>): construct a zero-initialized buffer of a given length.
I think this section could be better organized. I had to read it a few times to fully understand it. Maybe a table would better highlight the differences between the two constructors.
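For illustration, here is a minimal sketch of the constructor forms listed above. It assumes Python 3 as eventually released, where the mutable type is spelled bytearray rather than buffer; that name and the variable names are not part of the draft quoted here.

```python
# Assumption: bytearray stands in for the draft's "buffer" type.

immutable = bytes(b"abc")        # copying constructor; may return its argument
mutable = bytearray(b"abc")      # mutable copy of a bytes object

# From a text string: the encoding is mandatory, errors is optional.
encoded = bytes("caf\u00e9", "utf-8")
encoded_buf = bytearray("caf\u00e9", "utf-8", "strict")

from_view = bytes(memoryview(b"xyz"))   # anything supporting the buffer API
from_ints = bytes([104, 105])           # iterable of ints in range(256)
zeros = bytearray(4)                    # zero-initialized, length 4

# Only the mutable type supports item assignment.
mutable[0] = ord("A")
```

The main practical difference is the last line: the same assignment on a bytes object raises TypeError.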
Indexing
--------

Open Issue: I'm undecided on whether indexing bytes and buffer objects should return small ints (like the bytes type in 3.0a1, and like lists or array.array('B')), or bytes/buffer objects of length 1 (like the str type). The latter (str-like) approach will ease porting code from Python 2.x; but it makes it harder to extract values from a bytes array.
I think indexing a bytes/buffer object should return an int. To me, this behavior is more natural than using an ord()-like function to extract values. In fact, I have noticed that the use of ord() is a good indicator that bytes should be used instead of str (see for yourself: grep -R --include='*.py' 'ord(' python25/Lib).
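To make the two options concrete, here is a small sketch of the int-returning behavior. It assumes Python 3 as eventually released, where indexing a bytes object yields an int; the variable names are only illustrative.

```python
data = b"abc"

first = data[0]      # an int, not a length-1 bytes object
piece = data[0:1]    # slicing still yields a bytes object
total = sum(data)    # values can be consumed directly, no ord() needed

# With str, ord() is required to get at the numeric value.
text = "abc"
first_char_code = ord(text[0])
```

Code that wants the str-like result can still get it by slicing, as piece shows.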
Str() and Repr()
----------------

The str() and repr() functions return the same thing for these objects. The repr() of a bytes object returns a b'...' style literal. The repr() of a buffer returns a string of the form "buffer(b'...')".
Does that mean calling str() on a bytes/buffer object -- e.g., str(b"abc") -- wouldn't decode the content of the object (like array objects)?
Bytes and the Str Type
----------------------

Like the bytes type in Python 3.0a1, and unlike the relationship between str and unicode in Python 2.x, any attempt to mix bytes (or buffer) objects and str objects without specifying an encoding will raise a TypeError exception. This is the case even for simply comparing a bytes or buffer object to a str object (even violating the general rule that comparing objects of different types for equality should just return False).

Conversions between bytes or buffer objects and str objects must always be explicit, using an encoding. There are two equivalent APIs: str(b, <encoding>[, <errors>]) is equivalent to b.decode(<encoding>[, <errors>]), and bytes(s, <encoding>[, <errors>]) is equivalent to s.encode(<encoding>[, <errors>]).

There is one exception: we can convert from bytes (or buffer) to str without specifying an encoding by writing str(b). This produces the same result as repr(b). This exception is necessary because of the general promise that any object can be printed, and printing is just a special case of conversion to str. There is however no promise that printing a bytes object interprets the individual bytes as characters (unlike in Python 2.x).
Ah! That answers my last question. :)
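A short sketch of the conversions described above, assuming Python 3 as eventually released and run without the -b/-bb warning flags (which change how str() on a bytes object is treated). The variable names are only illustrative.

```python
b = b"abc"

as_repr = str(b)             # no encoding: same text as repr(b)
decoded = str(b, "ascii")    # explicit encoding: decodes the bytes
same = b.decode("ascii")     # equivalent spelling

round_trip = bytes(decoded, "ascii")   # same as decoded.encode("ascii")

# Mixing bytes and str without an encoding is rejected.
try:
    b + "abc"
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

So str(b"abc") gives the five-character string b'abc', not the decoded text abc.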
-- Alexandre