[Python-Dev] Re: Multibyte repr()

Guido van Rossum guido@python.org
Wed, 09 Oct 2002 15:04:08 -0400


I told this all to Tim, and he had one comment. The repr() function of an 8-bit string can now return characters with the high bit set. This was the direct cause of the failures. It was introduced in the following patch:

----------------------------
revision 2.190
date: 2002/10/07 13:55:50;  author: loewis;  state: Exp;  lines: +68 -15
Patch #479898: Use multibyte C library for printing strings if available.
----------------------------

Was this really a good idea???

Here's an example of what I mean.

Python 2.2:

  >>> u = u'\u1f40'
  >>> s = u.encode('utf8')
  >>> s
  '\xe1\xbd\x80'

Python 2.3:

  >>> u = u'\u1f40'
  >>> s = u.encode('utf8')
  >>> s
  'á½\x80'

The latter output is not helpful, because the encoding of s is not the locale's encoding.
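To make the mismatch concrete, here is a rough sketch in modern Python 3 (not the 2.x interpreters shown above). The helper locale_style_repr is hypothetical and only imitates the 2.3 behaviour; it assumes a Latin-1 locale, which is a guess on my part and not something the patch itself fixes.

```python
# A sketch only: Python 3 bytes stand in for the Python 2 8-bit string.
u = '\u1f40'
s = u.encode('utf-8')        # three bytes: 0xe1 0xbd 0x80

# Python 3's repr() of bytes escapes every non-ASCII byte, much as
# Python 2.2 did for 8-bit strings.
escaped = repr(s)            # "b'\\xe1\\xbd\\x80'"


def locale_style_repr(data):
    """Hypothetical imitation of the 2.3 behaviour.

    Assumes the locale encoding is Latin-1: each byte is treated as one
    character of that encoding and left unescaped if it is printable.
    """
    out = []
    for byte in data:
        ch = bytes([byte]).decode('latin-1')
        if ch in ("'", "\\"):
            out.append("\\" + ch)
        elif ch.isprintable():
            out.append(ch)                  # shown raw, in locale terms
        else:
            out.append('\\x%02x' % byte)    # still escaped
    return "'" + ''.join(out) + "'"


print(escaped)
print(locale_style_repr(s))
```

The second line printed contains two raw Latin-1 characters followed by \x80. Nothing in it tells the reader that the bytes were UTF-8, whereas the fully escaped form can be read back unambiguously.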

--Guido van Rossum (home page: http://www.python.org/~guido/)