[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
Guido van Rossum guido at python.org
Wed Feb 15 00:13:37 CET 2006
On 2/14/06, Neil Schemenauer <nas at arctrix.com> wrote:
> People could spell it bytes(s.encode('latin-1')) in order to make it work in 2.X. That spelling would provide a way of ensuring the type of the return value.
At the cost of an extra copying step.
> [Guido]
> > You missed the part where I said that introducing the bytes type without a literal seems to be a good first step. A new type, even built-in, is much less drastic than a new literal (which requires lexer and parser support in addition to everything else).
> Are you concerned about the implementation effort? If so, I don't think that's justified since adding a new string prefix should be pretty straightforward (relative to the rest of the effort involved).
Not so much the implementation but also the documentation, updating 3rd party Python preprocessors, etc.
> Are you comfortable with the proposed syntax?
Not entirely, since I don't know what b"abc€def" would mean (where € is a Unicode Euro character typed in whatever source encoding was used).
Instead of b"abc" (only ASCII) you could write bytes("abc"). Instead of b"\xf0\xff\xee" you could write bytes([0xf0, 0xff, 0xee]).
The key disconnect for me is that if bytes are not characters, we shouldn't use a literal notation that resembles the literal notation for characters. And there's growing consensus that a bytes type should be considered as an array of (8-bit unsigned) ints.
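As a rough illustration of the two literal-free spellings and of the "array of ints" view, here is a small sketch. It assumes a modern Python 3, whose bytes type postdates this thread and differs in one detail: building bytes from text needs an explicit encoding, so bytes("abc") on its own is rejected there. The variable names are only for illustration.

```python
# Assumption: modern Python 3, not the 2.x interpreter discussed in the thread.

# Text to bytes needs an explicit encoding in Python 3; "ascii" matches the
# "only ASCII" case mentioned above.
ascii_bytes = bytes("abc", "ascii")

# Arbitrary byte values are spelled as a sequence of small integers.
raw_bytes = bytes([0xF0, 0xFF, 0xEE])

# Indexing and iteration yield ints in range(256), not 1-character strings.
first = raw_bytes[0]
values = list(ascii_bytes)

print(ascii_bytes, raw_bytes, first, values)
```

Both constructor forms work without any lexer or parser support, which is the point of introducing the type before any literal.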
Also, bytes objects are (in my mind anyway) mutable. We have no other literal notation for mutable objects. What would the following code print?
    for i in range(2):
        b = b"abc"
        print b
        b[0] = ord("A")
Would the second output line print abc or Abc?
I guess the only answer that makes sense is that it should print abc both times; but that means that b"abc" must be internally implemented by creating a new bytes object each time. Perhaps the implementation effort isn't so minimal after all...
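The two possible behaviours can be sketched as follows. This assumes a modern Python 3 and uses bytearray as a stand-in for the mutable bytes object under discussion (in Python 3 as shipped, bytes is immutable and bytearray is the mutable variant). The function names run_loop_fresh and run_loop_shared are hypothetical, chosen only for this illustration.

```python
# Assumption: bytearray stands in for a mutable bytes object; modern Python 3.

def run_loop_fresh():
    """Literal creates a new object on every evaluation."""
    seen = []
    for i in range(2):
        b = bytearray(b"abc")      # fresh object each time through the loop
        seen.append(bytes(b))      # snapshot of what "print b" would show
        b[0] = ord("A")            # mutation is lost on the next iteration
    return seen

def run_loop_shared():
    """Hypothetical alternative: literal evaluated once and then reused."""
    shared = bytearray(b"abc")     # one object, created a single time
    seen = []
    for i in range(2):
        b = shared                 # every iteration sees the same object
        seen.append(bytes(b))
        b[0] = ord("A")            # mutation is visible on the next iteration
    return seen

print(run_loop_fresh())
print(run_loop_shared())
```

Only the first variant gives abc both times; the second shows the Abc surprise that a cached mutable literal would produce.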
(PS why is there a reply-to in your email that excludes you from the list of recipients but includes me?)
--
--Guido van Rossum (home page: http://www.python.org/~guido/)