[Python-Dev] bytes type discussion

"Martin v. Löwis" martin at v.loewis.de
Wed Feb 15 02:11:24 CET 2006


Raymond Hettinger wrote:

> - bytes("abc") == bytes(map(ord, "abc"))
>
> At first glance, this seems obvious and necessary, so if it's somewhat controversial, then I'm missing something. What's the issue?

There is an "implicit Latin-1" assumption in that code. Suppose you do

# -*- coding: koi8-r -*-

print bytes("Гвидо ван Россум")

in Python 2.x, then this means something (*). In Python 3, where the literal is a Unicode string, it gives you an exception, as the ordinals of its characters are suddenly above 255.
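The "implicit Latin-1" point can be checked with a short sketch in present-day Python 3. This is only an illustration: today's `bytes` type is not the API being proposed in this thread (in current Python 3, `bytes("abc")` without an encoding raises `TypeError`), and `bytes_from_ordinals` is a hypothetical helper name standing in for `bytes(map(ord, text))`.

```python
def bytes_from_ordinals(text):
    """Hypothetical helper: build bytes from code points, like bytes(map(ord, text))."""
    return bytes(map(ord, text))


ascii_text = "abc"
latin_text = "Löwis"      # every code point is below 256
cyrillic_text = "Гвидо"   # every code point is well above 255

# For code points below 256, the result is exactly the Latin-1 encoding.
assert bytes_from_ordinals(ascii_text) == ascii_text.encode("latin-1")
assert bytes_from_ordinals(latin_text) == latin_text.encode("latin-1")

# Cyrillic code points do not fit in a byte, so the construction fails.
try:
    bytes_from_ordinals(cyrillic_text)
    cyrillic_failed = False
except ValueError:
    cyrillic_failed = True

print(cyrillic_failed)
```

So mapping `ord` over a Unicode string amounts to encoding it as Latin-1, and it breaks as soon as the text leaves that range.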

Or, perhaps worse, the code

# -*- coding: utf-8 -*-

print bytes("Martin v. Löwis")

will work in 2.x and 3.x, but produce different numbers (**).

Regards, Martin

(*) [231, 215, 201, 196, 207, 32, 215, 193, 206, 32, 242, 207, 211, 211, 213, 205]

(**) In 2.x, this will give [77, 97, 114, 116, 105, 110, 32, 118, 46, 32, 76, 195, 182, 119, 105, 115] whereas in 3.x, it will give [77, 97, 114, 116, 105, 110, 32, 118, 46, 32, 76, 246, 119, 105, 115]
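The numbers in both footnotes can be reproduced in present-day Python 3 by encoding explicitly. This is a sketch under the assumption that a 2.x byte string in a koi8-r or utf-8 source file holds exactly the encoded bytes of the literal; the variable names are illustrative only.

```python
guido = "Гвидо ван Россум"
martin = "Martin v. Löwis"

# (*) What a 2.x byte-string literal would contain in a koi8-r source file.
koi8_values = list(guido.encode("koi8-r"))

# (**) 2.x behaviour in a utf-8 source file: "ö" occupies two bytes.
utf8_values = list(martin.encode("utf-8"))

# (**) 3.x behaviour under map(ord, ...): one code point per character.
ordinal_values = [ord(ch) for ch in martin]

print(koi8_values)
print(utf8_values)
print(ordinal_values)
```

The two lists for the second string differ only at "ö": the byte pair 195, 182 in UTF-8 versus the single code point 246.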


