[Python-Dev] Re: adding a bytes sequence type to Python
James Y Knight foom at fuhm.net
Wed Aug 18 00:04:45 CEST 2004
On Aug 17, 2004, at 5:18 PM, Bob Ippolito wrote:
On Aug 17, 2004, at 5:11 PM, Martin v. Löwis wrote:
Bob Ippolito wrote:
How would you embed raw bytes if the string was unicode?

The most direct notation would be bytes("delimited packet\x00"). However, people might not understand what is happening, and Guido doesn't like it if the bytes are >127.

I guess that was a bad example. What if the delimiter was \xff?
Indeed, if all strings are unicode, the question becomes: what encoding does bytes() use to translate unicode characters to bytes? Two alternatives have been proposed so far:
1. ASCII (translate chars as their codepoint if < 128, else error)
2. ISO-8859-1 (translate chars as their codepoint if < 256, else error)
I think I'd choose #2, myself.
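
To make the difference between the two alternatives concrete, here is a rough sketch. It assumes a modern Python 3 interpreter, where every str is unicode, and it uses str.encode() as a stand-in for whatever the proposed bytes() constructor would do; the constructor discussed in this thread did not exist yet, so this is only an illustration, not its actual behaviour.

```python
# Sketch only: str.encode() stands in for the hypothetical bytes() conversion.
text = "delimited packet\xff"

# ISO-8859-1 maps every code point below 256 to the byte with the same value.
latin1_bytes = text.encode("iso-8859-1")

# ASCII only accepts code points below 128, so the \xff delimiter is rejected.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Neither encoding can represent a code point of 256 or above.
try:
    "\u1000".encode("iso-8859-1")
    latin1_wide_ok = True
except UnicodeEncodeError:
    latin1_wide_ok = False
```

With ISO-8859-1 the \xff delimiter survives as the single byte 0xFF; with ASCII the same string raises an error.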
I know that map(ord, u'delimited packet\xff') would get correct results, but I don't think I like that either.
Why would you consider that wrong? ord(u'\xff') should return 255. Just as ord(u'\u1000') returns 4096. There's nothing mysterious there.
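
A quick check of the ord() values mentioned above, again assuming Python 3 (where map() returns an iterator, hence the list() call). The comparison against the ISO-8859-1 encoding is an added illustration, not something proposed in the thread.

```python
text = "delimited packet\xff"

# ord() returns the code point of each character as a plain integer.
code_points = list(map(ord, text))

# For code points below 256 this matches the ISO-8859-1 byte values exactly.
same_as_latin1 = bytes(code_points) == text.encode("iso-8859-1")

last_value = ord("\xff")    # 255
wide_value = ord("\u1000")  # 4096, too large to fit in a single byte
```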
James