BUG: HDFStore doesn't save datetime64[s] right
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
**** This doesn't work
import pandas as pd

df = pd.DataFrame(['2001-01-01', '2002-02-02'], dtype='datetime64[s]')
print(df)
Prints
0
0 2001-01-01
1 2002-02-02
print(df.dtypes)
Prints
0 datetime64[s]
dtype: object
with pd.HDFStore('deleteme.h5', mode='w') as store:
    store.put('df', df)

with pd.HDFStore('deleteme.h5', mode='r') as store:
    df_fromstore = store.get('df')
    print(df_fromstore)
Prints
0
0 1970-01-01 00:00:00.978307200
1 1970-01-01 00:00:01.012608000
Delete created file
import pathlib

pathlib.Path('deleteme.h5').unlink()
**** However this does work
import pandas as pd

df = pd.DataFrame(['2001-01-01', '2002-02-02'], dtype='datetime64[ns]')
print(df)
Prints
0
0 2001-01-01
1 2002-02-02
print(df.dtypes)
Prints
0 datetime64[ns]
dtype: object
with pd.HDFStore('deleteme.h5', mode='w') as store:
    store.put('df', df)

with pd.HDFStore('deleteme.h5', mode='r') as store:
    df_fromstore = store.get('df')
    print(df_fromstore)
Prints
0
0 2001-01-01
1 2002-02-02
Delete created file
import pathlib

pathlib.Path('deleteme.h5').unlink()
Issue Description
When a DataFrame column of dtype datetime64[s] (seconds, not nanoseconds) is saved to an HDF5 file (.h5) and the file is then loaded, the dates come back wrong. My guess is that the values are written with second resolution, but on read pandas interprets the stored integers as datetime64[ns]. That would explain the output: 978307200 is the number of seconds from the epoch to 2001-01-01, and read as nanoseconds it becomes 1970-01-01 00:00:00.978307200. I haven't tested other datetime resolutions.
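The suspected reinterpretation can be reproduced without HDF5 at all. The following is a minimal sketch using only NumPy; it illustrates the hypothesis (second-resolution integers read back as nanoseconds) and is not a trace of what HDFStore does internally.

```python
import numpy as np

# The same two dates as in the report, stored with second resolution.
seconds = np.array(['2001-01-01', '2002-02-02'], dtype='datetime64[s]')

# The underlying integers: seconds since 1970-01-01.
raw = seconds.view('int64')
print(raw.tolist())  # [978307200, 1012608000]

# Reinterpret the very same integers as nanoseconds since the epoch.
# This is the assumed failure mode, reproduced by hand.
misread = raw.view('datetime64[ns]')
print([str(x) for x in misread])
# ['1970-01-01T00:00:00.978307200', '1970-01-01T00:00:01.012608000']
```

The misread values match the timestamps returned by store.get('df') in the failing example, which is consistent with the unit being lost somewhere between write and read.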
Expected Behavior
When a datetime64[s] column is saved to an HDF5 file and that file is loaded, the dates should round-trip unchanged.
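Until the round trip works, one possible workaround is to cast datetime columns to datetime64[ns] before writing, since the nanosecond case above round-trips correctly. The sketch below assumes that is acceptable for the data; the helper name cast_datetimes_to_ns is hypothetical and not part of pandas, and it assumes unique column labels and timezone-naive columns.

```python
import pandas as pd


def cast_datetimes_to_ns(df):
    """Return a copy of df with every naive datetime column cast to datetime64[ns].

    Hypothetical helper for illustration; assumes unique column labels.
    """
    out = df.copy()
    for col in out.columns:
        # is_datetime64_dtype matches naive datetime64 columns of any unit.
        if pd.api.types.is_datetime64_dtype(out[col].dtype):
            out[col] = out[col].astype('datetime64[ns]')
    return out


df = pd.DataFrame(['2001-01-01', '2002-02-02'], dtype='datetime64[s]')
fixed = cast_datetimes_to_ns(df)
print(fixed.dtypes)
```

The resulting frame can then be passed to store.put('df', fixed) as in the working example. Note that datetime64[ns] covers a narrower date range than datetime64[s], so this is only suitable when all values fit in the nanosecond range.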
Installed Versions
UserWarning: Setuptools is replacing distutils. warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
commit : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python : 3.11.9.final.0
python-bits : 64
OS : Darwin
OS-release : 23.5.0
Version : Darwin Kernel Version 23.5.0: Wed May 1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0
setuptools : 70.0.0
pip : 24.0
Cython : 3.0.8
pytest : 8.0.0
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.1.9
lxml.etree : 5.1.0
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.21.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.8.2
numba : None
numexpr : 2.10.0
odfpy : None
openpyxl : 3.1.4
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.12.0
sqlalchemy : None
tables : 3.9.2
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None