BUG/API: can't pass parameters to csv module via df.to_csv · Issue #4528 · pandas-dev/pandas

Trying to print a data frame as plain, strict TSV (i.e., no quoting and no escaping, because I know none of the fields will contain tabs), I wanted to use the "quoting" option, which is documented in pandas and passed through to csv, together with the "quotechar" option, which is a csv option but is not documented in pandas. But it doesn't work:

In [1]: import sys, csv

In [2]: from pandas import DataFrame

In [3]: data = {'col1': ['contents of col1 row1', 'contents " of col1 row2'], 'col2': ['contents of col2 row1', 'contents " of col2 row2'] }

In [4]: df = DataFrame(data)

In [5]: df.to_csv(sys.stdout, sep='\t', quoting=csv.QUOTE_NONE, quotechar=None)
        col1    col2
0       contents of col1 row1   contents of col2 row1

Error                                     Traceback (most recent call last)
in ()
----> 1 df.to_csv(sys.stdout, sep='\t', quoting=csv.QUOTE_NONE, quotechar=None)

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/frame.pyc in to_csv(self, path_or_buf, sep, na_rep, float_format, cols, header, index, index_label, mode, nanRep, encoding, quoting, line_terminator, chunksize, tupleize_cols, **kwds)
   1409                                      tupleize_cols=tupleize_cols,
   1410                                      )
-> 1411         formatter.save()
   1412
   1413     def to_excel(self, excel_writer, sheet_name='sheet1', na_rep='',

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in save(self)
    974
    975             else:
--> 976                 self._save()
    977
    978

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in _save(self)
   1080                 break
   1081
-> 1082             self._save_chunk(start_i, end_i)
   1083
   1084     def _save_chunk(self, start_i, end_i):

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in _save_chunk(self, start_i, end_i)
   1098         ix = data_index.to_native_types(slicer=slicer, na_rep=self.na_rep, float_format=self.float_format)
   1099
-> 1100         lib.write_csv_rows(self.data, ix, self.nlevels, self.cols, self.writer)
   1101
   1102 # from collections import namedtuple

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/lib.so in pandas.lib.write_csv_rows (pandas/lib.c:13871)()

Error: need to escape, but no escapechar set
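
For reference, the error appears to come from the csv writer itself rather than from pandas: with QUOTE_NONE the writer still treats its quotechar (default '"') as a character that needs escaping, and no escapechar is set. The following is a minimal sketch using only the standard library (Python 3 syntax); the helper name write_row is made up for illustration and is not part of pandas or csv.

```python
import csv
import io

# One row whose second field contains a double quote, as in the report above.
row = ["contents of col1 row1", 'contents " of col1 row2']


def write_row(**fmtparams):
    """Hypothetical helper: write `row` as unquoted TSV and return the text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", quoting=csv.QUOTE_NONE, **fmtparams)
    writer.writerow(row)
    return buf.getvalue()


# Default quotechar ('"'): the writer wants to escape the embedded quote,
# but no escapechar is set, so it raises csv.Error.
try:
    write_row()
    default_failed = False
except csv.Error as exc:
    default_failed = True
    print("csv.Error:", exc)

# quotechar=None: the double quote is no longer special and is written as-is.
print(repr(write_row(quotechar=None)))
```

This is why quoting=csv.QUOTE_NONE alone is not enough here: quotechar=None also has to reach the underlying writer.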

Adding the parameter

quotechar=kwds.get("quotechar")

to the

formatter = fmt.CSVFormatter(...

call in to_csv(), and making the corresponding changes to format.CSVFormatter's __init__() and save(), produces the expected output:

In [1]: import sys, csv

In [2]: from pandas import DataFrame

In [3]: data = {'col1': ['contents of col1 row1', 'contents " of col1 row2'], 'col2': ['contents of col2 row1', 'contents " of col2 row2'] }

In [4]: df = DataFrame(data)

In [5]: df.to_csv(sys.stdout, sep='\t', quoting=csv.QUOTE_NONE, quotechar=None)
        col1    col2
0       contents of col1 row1   contents of col2 row1
1       contents " of col1 row2 contents " of col2 row2

i.e., unescaped, unquoted tsv.

More generally, there could be many reasons to want more control over the underlying csv writer, so a generic mechanism (as opposed to adding each parameter one by one) might be called for, e.g., allowing a csv dialect object, or at least a dictionary holding dialect attributes.
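
As a rough illustration of what such a generic mechanism could look like, here is a sketch built only on the standard csv module. The names StrictTSV and write_rows are hypothetical; this is not the pandas API, just one way a dialect object or a dictionary of dialect attributes could be forwarded to csv.writer unchanged.

```python
import csv
import io


class StrictTSV(csv.Dialect):
    """Hypothetical dialect: tab-separated, never quoted, never escaped."""
    delimiter = "\t"
    quoting = csv.QUOTE_NONE
    quotechar = None
    escapechar = None
    doublequote = False
    skipinitialspace = False
    lineterminator = "\n"


def write_rows(rows, buf, dialect="excel", **fmtparams):
    """Hypothetical helper: forward a dialect and/or any csv formatting
    parameters straight to csv.writer instead of naming each one."""
    writer = csv.writer(buf, dialect=dialect, **fmtparams)
    writer.writerows(rows)


rows = [["col1", "col2"], ['contents " of col1 row2', 'contents " of col2 row2']]

# Variant 1: pass a dialect object.
out = io.StringIO()
write_rows(rows, out, dialect=StrictTSV)
print(out.getvalue(), end="")

# Variant 2: pass a plain dictionary of dialect attributes.
opts = {"delimiter": "\t", "quoting": csv.QUOTE_NONE,
        "quotechar": None, "lineterminator": "\n"}
out2 = io.StringIO()
write_rows(rows, out2, **opts)
print(out2.getvalue(), end="")
```

Either variant keeps the wrapper unaware of individual csv options, so new writer parameters would not require a signature change.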