Rounding errors with Timestamps and .last() · Issue #19526 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
In [1]: import pandas as pd
In [3]: ts = pd.Timestamp('2016-10-14 21:00:44.557')
In [4]: pd.DataFrame({'a': range(2), 'b': [ts, pd.NaT]}).groupby('a').b.last()
Out[4]:
a
0 2016-10-14 21:00:44.556999936
1 NaT
Name: b, dtype: datetime64[ns]
Problem description
The Timestamp returned by .last() is not equal to the input value: it comes back as 2016-10-14 21:00:44.556999936 instead of 2016-10-14 21:00:44.557, a difference of 64 ns.
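The cause is not identified in this report. One hypothesis that matches the numbers is that the underlying int64 nanosecond value passes through float64 somewhere in the .last() code path. The sketch below uses only the standard library to reproduce the arithmetic; the float round trip is an assumption about what pandas does internally, not something confirmed here.

```python
from datetime import datetime, timezone

# Nanoseconds since the Unix epoch for 2016-10-14 21:00:44.557. This is the
# integer that pd.Timestamp('2016-10-14 21:00:44.557').value would hold.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
moment = datetime(2016, 10, 14, 21, 0, 44, 557000, tzinfo=timezone.utc)
delta = moment - epoch
ns = (delta.days * 86_400 + delta.seconds) * 10**9 + delta.microseconds * 1_000

# Assumption (hypothetical): the value is cast to float64 and back to int64.
# Near 1.5e18, adjacent float64 values are 256 apart, so the cast rounds.
round_tripped = int(float(ns))

print(ns)                  # 1476478844557000000
print(round_tripped)       # 1476478844556999936
print(ns - round_tripped)  # 64
```

The rounded value ends in ...556999936, which is the fractional part shown in Out[4].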
Expected Output
Out[4]:
a
0 2016-10-14 21:00:44.557
1 NaT
Name: b, dtype: datetime64[ns]
The issue is a bit subtle. Here are three related examples, all of which do work as expected:
I haven't been able to repro with .nth(-1) in place of .last():
In [1]: import pandas as pd
In [3]: ts = pd.Timestamp('2016-10-14 21:00:44.557')
In [5]: pd.DataFrame({'a': range(2), 'b': [ts, pd.NaT]}).groupby('a').b.nth(-1)
Out[5]:
a
0 2016-10-14 21:00:44.557
1 NaT
Name: b, dtype: datetime64[ns]
I haven't been able to repro if NaT is not present:
In [6]: pd.DataFrame({'a': range(2), 'b': [ts, ts]}).groupby('a').b.last()
Out[6]:
a
0 2016-10-14 21:00:44.557
1 2016-10-14 21:00:44.557
Name: b, dtype: datetime64[ns]
I haven't been able to repro for timestamps that fall on a whole second:
In [7]: ts = pd.Timestamp('2016-10-14 21:00:44')
In [8]: pd.DataFrame({'a': range(2), 'b': [ts, pd.NaT]}).groupby('a').b.last()
Out[8]:
a
0 2016-10-14 21:00:44
1 NaT
Name: b, dtype: datetime64[ns]
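If the rounding comes from a float64 round trip (an assumption, not established in this report), the whole-second case would be expected to pass: a whole number of seconds in nanoseconds is a multiple of 10**9 = 2**9 * 5**9, hence a multiple of 256, which is the spacing of float64 values at this magnitude. A minimal check, where `survives_float64` is a hypothetical helper written for this illustration:

```python
from datetime import datetime, timezone


def survives_float64(ns: int) -> bool:
    """Return True if ns is unchanged by an int -> float64 -> int round trip."""
    return int(float(ns)) == ns


# Epoch seconds for 2016-10-14 21:00:44 UTC, converted to nanoseconds.
seconds = int(datetime(2016, 10, 14, 21, 0, 44, tzinfo=timezone.utc).timestamp())
whole_second = seconds * 10**9
with_millis = whole_second + 557 * 10**6  # add the .557 from the failing case

# Remainder modulo 256 shows how far each value is from a representable float64.
print(whole_second % 256, survives_float64(whole_second))  # 0 True
print(with_millis % 256, survives_float64(with_millis))    # 64 False
```

This is consistent with Out[8] being exact while Out[4] is off by 64 ns, but it does not explain why the problem only appears when NaT is present.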
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.13.0-32-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.22.0
pytest: 3.2.1
pip: 9.0.1
setuptools: 38.2.4
Cython: 0.26.1
numpy: 1.14.0
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.0
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None