[Python-Dev] sys.setdefaultencoding() vs. csv module + unicode
"Martin v. Löwis" martin at v.loewis.de
Thu Jun 14 08:47:43 CEST 2007
- Previous message: [Python-Dev] sys.setdefaultencoding() vs. csv module + unicode
- Next message: [Python-Dev] [RFC] urlparse - parse query facility
The csv module says it's not unicode safe, but the 2.5 docs [3] have a workaround for this. While the workaround says nothing about sys.setdefaultencoding(), it simply does not work with the default encoding, "ascii". Is this the problem with the csv module? Should I give up and use XML? Below is code that works vs. code that doesn't. Am I interpreting the workaround from the docs wrong?
These questions are off-topic for python-dev; please ask them on comp.lang.python instead. python-dev is for the development of Python, not for the development with Python.
kumar$ python2.5
Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
import sys, csv, codecs
f = codecs.open('unicsv.csv','wb','utf-8')
w = csv.writer(f)
w.writerow([u'lang', u'espa\xa4ol'])
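What you should do here is:

def encoderow(r):
    # Encode every unicode field to a UTF-8 byte string.
    return [s.encode("utf-8") for s in r]

# Open a plain binary file; the rows are already encoded,
# so no codecs wrapper (and no encoding argument) is needed.
f = open('unicsv.csv', 'wb')
w = csv.writer(f)
w.writerow(encoderow([u'lang', u'espa\xa4ol']))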
IOW, you need to encode before passing the strings to the CSV module, not afterwards.
If it is too tedious for you to put in the encoderow calls all the time, you can write a wrapper for CSV writers which transparently encodes all Unicode strings.
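A minimal sketch of such a wrapper, assuming the Python 2 setup used in this thread. The class name EncodingWriter and its encoding parameter are made up for illustration and are not part of the csv module. It only encodes unicode fields and delegates everything else to the writer it wraps. The text_type fallback is there only so the sketch also imports on Python 3, where the csv module accepts text directly and no wrapper is needed.

```python
try:
    text_type = unicode   # Python 2, as used in this thread
except NameError:
    text_type = str       # Python 3: only so the sketch can be imported there


class EncodingWriter(object):
    """Hypothetical wrapper: encode unicode fields, then delegate."""

    def __init__(self, writer, encoding="utf-8"):
        # 'writer' is any object with a writerow() method,
        # e.g. the result of csv.writer(f).
        self.writer = writer
        self.encoding = encoding

    def _encode(self, value):
        # Only unicode strings are encoded; byte strings, numbers,
        # None, etc. are passed through unchanged.
        if isinstance(value, text_type):
            return value.encode(self.encoding)
        return value

    def writerow(self, row):
        self.writer.writerow([self._encode(v) for v in row])

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)


# Intended use on Python 2 (not executed here):
#   import csv
#   f = open('unicsv.csv', 'wb')
#   w = EncodingWriter(csv.writer(f))
#   w.writerow([u'lang', u'espa\xa4ol'])
#   f.close()
```

Wrapping the writer rather than the file keeps the encoding step in front of the csv module, which is the order described above.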
Regards, Martin