PEP 263 phase 2 implementation (Re: [Python-Dev] PEP 263 considered faulty (for some Japanese))

SUZUKI Hisao suzuki611@oki.com
Tue, 26 Mar 2002 09:02:15 +0900


> N.B. one should write a binary (not character, but, say, image
> or audio) data literal as follows:
>
>   b = '\x89\xAB\xCD\xEF'

I completely agree. Binary data should use hex escapes. That will make an interesting challenge for any stage 2 implementation, BTW: \xAB shall denote byte 0xAB no matter what the input encoding was. So you cannot convert \xAB to a Unicode character and expect conversion back to the input encoding to do the right thing. Instead, you must apply the conversion to the source encoding only for the unescaped characters.
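
To illustrate the point about escapes, here is a minimal sketch in present-day Python. It is not code from any PEP 263 implementation. The function names naive_value and escape_aware_value are hypothetical, the sketch handles only \xNN escapes, and it assumes the literal body has already been decoded from the source encoding to a Unicode string.

```python
import re

# Matches only \xNN escapes. Other escapes (including an escaped
# backslash such as \\x41) are deliberately not handled in this sketch.
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def naive_value(body: str, source_encoding: str) -> bytes:
    """Turn each \\xNN into the Unicode character U+00NN, then encode
    the whole string. This is the approach the text warns against."""
    text = _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), body)
    return text.encode(source_encoding)


def escape_aware_value(body: str, source_encoding: str) -> bytes:
    """Encode only the unescaped runs back to the source encoding and
    emit each \\xNN escape as the single raw byte 0xNN."""
    out = bytearray()
    pos = 0
    for m in _HEX_ESCAPE.finditer(body):
        out += body[pos:m.start()].encode(source_encoding)  # unescaped text
        out.append(int(m.group(1), 16))                     # raw byte
        pos = m.end()
    out += body[pos:].encode(source_encoding)               # trailing text
    return bytes(out)


binary = escape_aware_value(r"\x89\xAB\xCD\xEF", "euc_jp")  # four raw bytes
wrong = naive_value(r"\x89", "utf-8")                       # b'\xc2\x89', two bytes
```

With the escape-aware version, the four escapes in the literal above give exactly the bytes 0x89 0xAB 0xCD 0xEF for any source encoding, while ordinary characters in the same literal keep their source-encoding bytes.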

Note that it is not a challenge for my implementation at all. You can use your binary strings as they are at present. Please try it.

People have been proposing to introduce b'' strings for binary data, so that 'plain' strings could be switched to denote Unicode strings at some point, but that is a different PEP.

I think you need not introduce b'' strings at all; you can keep it simple as it is.

-- SUZUKI Hisao <suzuki@acm.org> <suzuki611@oki.com>