ERR: validate encoding on to_stata · Issue #15723 · pandas-dev/pandas

It seems that pandas on Python 3.5 writes a corrupt Stata file when an encoding is passed to to_stata. For example, the following generates a corrupt output file:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.array([1, 2, 3, 4]), columns=['var1'])
df1.to_stata('corrupt.dta', write_index=False, encoding='utf8')

while

df1.to_stata('not-corrupt.dta', write_index=False)

generates a correct file. I suspect this is caused by the encoding argument and the difference in how Python 2 and Python 3 treat strings, which breaks compatibility of scripts across Python versions. It would be nice if to_stata did not take this option into account on Python 3, unless the error is caused by something else.
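
To illustrate what validating the encoding up front could look like, here is a rough sketch. It is not pandas code: `check_stata_encoding` and `_ALLOWED` are hypothetical names, and the set of allowed encodings is an assumption (that the file format being written only supports ASCII and Latin-1). It only relies on the standard library's `codecs.lookup` to normalize aliases such as 'latin1' or 'ASCII'.

```python
import codecs

# Assumption: only these single-byte encodings are safe for the file format
# being written. The names are the canonical ones returned by codecs.lookup.
_ALLOWED = {"ascii", "iso8859-1"}


def check_stata_encoding(encoding):
    """Hypothetical helper: return the canonical codec name or raise ValueError."""
    try:
        canonical = codecs.lookup(encoding).name
    except LookupError:
        raise ValueError("Unknown encoding: {!r}".format(encoding))
    if canonical not in _ALLOWED:
        raise ValueError(
            "Encoding {!r} is not supported here; use 'ascii' or 'latin-1'".format(encoding)
        )
    return canonical
```

With a check like this, the call with encoding='utf8' above would raise a ValueError instead of silently writing a file that cannot be read back.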