[Python-Dev] sys.setdefaultencoding() vs. csv module + unicode

"Martin v. Löwis" martin at v.loewis.de
Thu Jun 14 08:47:43 CEST 2007

The csv module says it's not unicode safe, but the 2.5 docs [3] have a workaround for this. While the workaround says nothing about sys.setdefaultencoding(), it simply does not work with the default encoding, "ascii". Is this the problem with the csv module? Should I give up and use XML? Below is code that works vs. code that doesn't. Am I interpreting the workaround from the docs wrong?

These questions are off-topic for python-dev; please ask them on comp.lang.python instead. python-dev is for the development of Python, not for the development with Python.

kumar$ python2.5
Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
>>> import sys, csv, codecs
>>> f = codecs.open('unicsv.csv','wb','utf-8')
>>> w = csv.writer(f)
>>> w.writerow([u'lang', u'espa\xa4ol'])

What you should do here is

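def encoderow(r):
    # Encode every Unicode string in the row to a UTF-8 byte string.
    return [s.encode("utf-8") for s in r]

# Plain open(), not codecs.open(): the rows are already encoded.
f = open('unicsv.csv', 'wb')
w = csv.writer(f)
w.writerow(encoderow([u'lang', u'espa\xa4ol']))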

IOW, you need to encode before passing the strings to the CSV module, not afterwards.

If it is too tedious for you to put in the encoderow calls all the time, you can write a wrapper for CSV writers which transparently encodes all Unicode strings.
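Such a wrapper might look roughly like the sketch below. The names EncodingWriter and RecordingWriter are hypothetical and not part of the csv module; RecordingWriter is only a stand-in for csv.writer(f) so that the encoding step can be inspected without touching a file. The sketch assumes that only Unicode strings need encoding and that every other value should be passed through untouched.

```python
# Sketch of a writer wrapper. EncodingWriter and RecordingWriter are
# hypothetical names, not part of the csv module.

text_type = type(u"")  # unicode on Python 2, str on Python 3


class EncodingWriter(object):
    """Wrap any object with writerow() and encode text fields first."""

    def __init__(self, writer, encoding="utf-8"):
        self.writer = writer
        self.encoding = encoding

    def _encode(self, value):
        # Only text strings are encoded; numbers, byte strings and None
        # are handed to the wrapped writer unchanged.
        if isinstance(value, text_type):
            return value.encode(self.encoding)
        return value

    def writerow(self, row):
        return self.writer.writerow([self._encode(v) for v in row])

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)


class RecordingWriter(object):
    """Stand-in for csv.writer(f): remembers the rows it receives."""

    def __init__(self):
        self.rows = []

    def writerow(self, row):
        self.rows.append(row)


recorder = RecordingWriter()
w = EncodingWriter(recorder)
w.writerow([u'lang', u'espa\xa4ol', 42])
```

Under Python 2.5 the wrapped object would be a real writer, for example EncodingWriter(csv.writer(open('unicsv.csv', 'wb'))). On Python 3 the csv module accepts text strings directly, so a wrapper like this is not needed there.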

Regards, Martin
