[Python-Dev] bytes type discussion

Adam Olsen rhamph at gmail.com
Wed Feb 15 05:41:02 CET 2006


On 2/14/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> Raymond Hettinger wrote:
> >>- bytes("abc") == bytes(map(ord, "abc"))
> >
> > At first glance, this seems obvious and necessary, so if it's somewhat
> > controversial, then I'm missing something. What's the issue?
>
> There is an "implicit Latin-1" assumption in that code. Suppose you do
>
>     # -*- coding: koi-8r -*-
>     print bytes("Гвидо ван Россум")
>
> in Python 2.x, then this means something (*). In Python 3, it gives you
> an exception, as the ordinals of this are suddenly above 256. Or, perhaps
> worse, the code
>
>     # -*- coding: utf-8 -*-
>     print bytes("Martin v. Löwis")
>
> will work in 2.x and 3.x, but produce different numbers (**).
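The two failure modes described above can be sketched roughly as follows. This assumes Python 3 as it eventually shipped, where str holds code points; the helper name implicit_latin1 is hypothetical and only stands in for the proposed bytes(map(ord, s)) equivalence.

```python
# Sketch only: Python 3 semantics, where str is a sequence of code points.

def implicit_latin1(text):
    """Hypothetical helper: build bytes from ordinals, like bytes(map(ord, text))."""
    return bytes(map(ord, text))

# Ordinals below 256 fit in one byte each, so this matches Latin-1 encoding.
ascii_ok = implicit_latin1("abc")
latin1_ok = implicit_latin1("Löwis")

# Cyrillic code points do not fit in a byte, so the conversion fails.
try:
    implicit_latin1("Гвидо")
    cyrillic_error = None
except ValueError as exc:
    cyrillic_error = exc

# The same character yields different numbers under different encodings.
as_latin1 = list("ö".encode("latin-1"))
as_utf8 = list("ö".encode("utf-8"))
```

Here latin1_ok equals "Löwis".encode("latin-1"), which is the "implicit Latin-1" behaviour, while as_latin1 and as_utf8 differ for the same character.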

My assumption is these would become errors in 3.x. bytes(str) is only needed so you can do bytes(u"abc".encode('utf-8')) and have it work in 2.x and 3.x.
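A minimal sketch of that explicit-encoding pattern follows. It assumes the bytes type as it later shipped in Python 3, which was still under discussion when this was written; the variable names are illustrative only.

```python
# Sketch only: behaviour of the released Python 3 bytes type.

# Encode explicitly, then pass the result to bytes(); no encoding is guessed.
encoded = bytes(u"abc".encode("utf-8"))

# Non-Latin-1 text also works once the codec is named explicitly.
koi8 = bytes(u"Гвидо".encode("koi8-r"))

# Passing a str with no encoding is rejected rather than guessed at.
try:
    bytes(u"abc")
    bare_error = None
except TypeError as exc:
    bare_error = exc
```

In released Python 3 the bare bytes(str) call raises TypeError, so the encoding always has to be stated by the caller.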

(I wonder if maybe they should be an error in 2.x as well. Source encoding is for unicode literals, not str literals.)

-- Adam Olsen, aka Rhamphoryncus


