Error when writing non-ascii allowed characters to Stata dta · Issue #7286 · pandas-dev/pandas
When I try to write a Stata dataset whose strings contain characters from the upper half of Latin-1 (code points above 127, which the Stata format allows), I get an encoding error.
import pandas.io.stata as sta
sr = sta.StataReader('pandas/pandas/io/tests/data/stata1_encoding.dta')
df = sr.data()
sw = sta.StataWriter('stata1_encoding_dup.dta', df)
sw.write_file()
I get the following output:
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\site-packages\pandas\io\stata.py", line 1242, in write_file
    self._write_data_nodates()
  File "C:\Python27\lib\site-packages\pandas\io\stata.py", line 1326, in _write_data_nodates
    self._write(var)
  File "C:\Python27\lib\site-packages\pandas\io\stata.py", line 1104, in _write
    self._file.write(to_write)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 1: ordinal not in range(128)
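The failing character can be looked at in isolation, outside pandas. The snippet below is only a minimal sketch: it assumes the writer ends up encoding the unicode string with the default 'ascii' codec, which is what the error message suggests, and it shows that the same character fits in a single byte under Latin-1. The helper name `try_encode` is made up for illustration and is not part of pandas.

```python
# Sketch independent of pandas: the character from the traceback, u'\xfc' ('ü').
text = u"\xfc"  # U+00FC, a valid Latin-1 character


def try_encode(value, encoding):
    """Hypothetical helper: return the encoded bytes, or None if the codec fails."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        return None


# 'ascii' only covers ordinals below 128, so U+00FC (252) cannot be encoded.
ascii_result = try_encode(text, "ascii")

# Latin-1 maps U+00FC to the single byte 0xFC.
latin1_result = try_encode(text, "latin-1")

print(ascii_result is None)
print(len(latin1_result))
```

If that assumption holds, encoding string values as Latin-1 before they reach the file write would likely avoid the error, but I have not checked this against the writer code.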
Machine info:
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 32
OS: Windows
OS-release: 7
machine: AMD64
processor: AMD64 Family 16 Model 6 Stepping 3, AuthenticAMD
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.14.0
nose: 1.3.3
Cython: 0.20.1
numpy: 1.8.1
scipy: None
statsmodels: None
IPython: None
sphinx: None
patsy: None
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.3
bottleneck: 0.8.0
tables: None
numexpr: None
matplotlib: None
openpyxl: 1.8.6
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
bq: None
apiclient: None
rpy2: None
sqlalchemy: None
pymysql: None
psycopg2: None