Allow timestamp option for StataWriter.write_file() · Issue #6545 · pandas-dev/pandas

This is a combined feature request & minor bug notice.
Feature Request: I would like to be able to write code that produces byte-for-byte reproducible output. To that end, I want to write Stata dta files with a blank (or constant) timestamp. It would be nice if write_file() accepted a timestamp (or some option to zero it out).
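To illustrate the request, here is a minimal sketch of a writer that takes an optional timestamp. `write_toy_file` and its header layout are hypothetical stand-ins, not pandas API and not the real dta format; the point is only that a caller-supplied constant timestamp makes repeated writes identical.

```python
import datetime
import hashlib


def write_toy_file(payload, time_stamp=None):
    """Hypothetical writer: returns a timestamp header followed by the payload."""
    if time_stamp is None:
        # Default behaviour: stamp the file with the current time,
        # which makes every run produce different bytes.
        time_stamp = datetime.datetime.now()
    elif not isinstance(time_stamp, datetime.datetime):
        raise ValueError("time_stamp should be a datetime.datetime")
    header = time_stamp.strftime("%Y-%m-%d %H:%M:%S.%f").encode("ascii")
    return header + b"\x00" + payload


# A constant timestamp chosen by the caller.
fixed = datetime.datetime(2000, 1, 1)
first = write_toy_file(b"data", time_stamp=fixed)
second = write_toy_file(b"data", time_stamp=fixed)

# With the timestamp pinned, the two outputs hash identically.
same_digest = hashlib.sha256(first).digest() == hashlib.sha256(second).digest()
```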

Bug: In an attempt to do this myself, I made my own version of StataWriter.write_file() where the only difference is that I call the internal _write_header() function with a constant timestamp. That fails with the error shown below.

import pandas as pd
import numpy as np
from pandas.io.stata import StataWriter
import datetime

df = pd.DataFrame(np.random.randn(6,4),index=list('abcdef'),columns=list('ABCD'))
writer = StataWriter('ouput.dta', df)
fktime_stamp = datetime.datetime.now()
writer._write_header(time_stamp=fktime_stamp)
# rest of write_file()

This produces the following error:

  File "C:\Program Files\Python27\lib\site-packages\pandas\io\stata.py", line 1057, in _write_header
    elif not isinstance(time_stamp, datetime):
TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types
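The traceback is consistent with the name `datetime` referring to the `datetime` module, rather than the `datetime.datetime` class, at that point in stata.py. I have not confirmed this against the pandas source, so treat it as an assumption. The sketch below reproduces the same kind of failure with the standard library alone; the two helper functions are hypothetical names used only for illustration.

```python
import datetime


def check_with_module(time_stamp):
    # Mirrors the failing pattern: `datetime` here is the module, not a class,
    # so isinstance() rejects it as its second argument.
    return isinstance(time_stamp, datetime)


def check_with_class(time_stamp):
    # Naming the class explicitly makes the check valid.
    return isinstance(time_stamp, datetime.datetime)


stamp = datetime.datetime(2014, 3, 5, 12, 0)

try:
    check_with_module(stamp)
    module_check_failed = False
except TypeError:
    module_check_failed = True

class_check_passed = check_with_class(stamp)
```

If that assumption holds, changing the check to use `datetime.datetime` (or importing the class directly) should be enough to let a caller-supplied timestamp through.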

My system details are:

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.2.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: AMD64 Family 16 Model 6 Stepping 3, AuthenticAMD
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.13.1
Cython: None
numpy: 1.8.1
scipy: None
statsmodels: None
IPython: None
sphinx: None
patsy: None
scikits.timeseries: None
dateutil: 2.2
pytz: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
sqlalchemy: None
lxml: None
bs4: None
html5lib: None
bq: None
apiclient: None