[Python-Dev] Do I misunderstand how codecs.EncodedFile is supposed to work?

Skip Montanaro skip@pobox.com
Tue, 6 Aug 2002 23:04:51 -0500


The following simple session suggests I misunderstood how the codecs.EncodedFile function should work:

>>> import codecs
>>> f = codecs.EncodedFile(open("unicode-test.txt", "w"), "utf-8")
>>> s = 'Caffe\x92 Lena'
>>> u = unicode(s, "cp1252")
>>> u
u'Caffe\u2019 Lena'
>>> f.write(u.encode("utf-8"))
>>> f.write(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.3/codecs.py", line 453, in write
    data, bytesdecoded = self.decode(data, self.errors)
UnicodeError: ASCII encoding error: ordinal not in range(128)

I thought the whole purpose of the EncodedFile class was to provide transparent encoding. Shouldn't it support transparent encoding of Unicode objects? That is, I told the system I want writes to be in utf-8 when I instantiated the class. I don't think I should have to call .encode() directly. I realize I can wrap the function in a class that adds the transparency I desire, but it seems the whole point should be to make it easy to write Unicode objects to files.
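
A minimal sketch of the transparent behavior asked for above, under a few assumptions. It uses codecs.getwriter (a stream writer that accepts Unicode text and encodes it on write) rather than EncodedFile. It writes to an in-memory io.BytesIO buffer instead of unicode-test.txt, and it is written for a current Python rather than the 2.3 build in the traceback. The names raw, writer, recode_buf and recoder are illustrative only. The second half shows what EncodedFile appears to be designed for: recoding byte strings from one encoding to another. That would explain why write() starts with a decode step and fails when handed a Unicode object.

```python
import codecs
import io

# Same text as in the session: cp1252 bytes decoded to Unicode.
s = b"Caffe\x92 Lena"
u = s.decode("cp1252")

# Part 1: a UTF-8 stream writer. write() takes Unicode text and
# encodes it before passing bytes to the underlying stream.
raw = io.BytesIO()                       # stand-in for a file opened in binary mode
writer = codecs.getwriter("utf-8")(raw)
writer.write(u)                          # no explicit .encode() needed
written = raw.getvalue()

# Part 2: EncodedFile recodes *bytes*. Here the caller supplies cp1252
# bytes (data_encoding) and the file receives UTF-8 (file_encoding).
recode_buf = io.BytesIO()
recoder = codecs.EncodedFile(recode_buf, "cp1252", "utf-8")
recoder.write(s)                         # decoded as cp1252, written as UTF-8
recoded = recode_buf.getvalue()
```

Under these assumptions both buffers end up holding the same UTF-8 bytes. The difference is only in what the caller passes to write(): Unicode text for the stream writer, already-encoded bytes for EncodedFile.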

Skip