Rounding errors with Timestamps and .last() · Issue #19526 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd
In [3]: ts = pd.Timestamp('2016-10-14 21:00:44.557')
In [4]: pd.DataFrame({'a': range(2), 'b': [ts, pd.NaT]}).groupby('a').b.last()
Out[4]:
a
0 2016-10-14 21:00:44.556999936
1 NaT
Name: b, dtype: datetime64[ns]

Problem description

The value of the Timestamp in the output is not equal to the value in the input.

Expected Output

Out[4]:
a
0 2016-10-14 21:00:44.557
1 NaT
Name: b, dtype: datetime64[ns]

The issue is a bit subtle. Here are three related examples, all of which do work as expected:

I haven't been able to repro with .nth(-1) in place of last():

In [1]: import pandas as pd
In [3]: ts = pd.Timestamp('2016-10-14 21:00:44.557')
In [5]: pd.DataFrame({'a': range(2), 'b': [ts, pd.NaT]}).groupby('a').b.nth(-1)
Out[5]:
a
0 2016-10-14 21:00:44.557
1 NaT
Name: b, dtype: datetime64[ns]

I haven't been able to repro if NaT is not present:

In [6]: pd.DataFrame({'a': range(2), 'b': [ts, ts]}).groupby('a').b.last()
Out[6]: 
a
0   2016-10-14 21:00:44.557
1   2016-10-14 21:00:44.557
Name: b, dtype: datetime64[ns]

I haven't been able to repro for times which are integer seconds:

In [7]: ts = pd.Timestamp('2016-10-14 21:00:44')
In [8]: pd.DataFrame({'a': range(2), 'b': [ts, pd.NaT]}).groupby('a').b.last()
Out[8]: 
a
0   2016-10-14 21:00:44
1                   NaT
Name: b, dtype: datetime64[ns]
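
This report does not identify a cause. One hypothesis that fits the observations above is that, when NaT is present, the int64 nanosecond values pass through float64 somewhere inside .last(). That is an assumption, not something confirmed here. A float64 has a 53-bit significand, so integers near 1.5e18 can only be represented in steps of 256. A millisecond timestamp is a multiple of 64 ns but not necessarily of 256 ns, while a whole-second timestamp is a multiple of 512 ns and therefore survives the round trip. The sketch below uses only the standard library; the helper names to_ns and float64_round_trip are made up for illustration.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ns(dt):
    """Nanoseconds since the Unix epoch, computed with exact integer arithmetic."""
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def float64_round_trip(ns):
    """Convert an integer to a Python float (an IEEE 754 double) and back."""
    return int(float(ns))


# The two timestamps from the examples above, treated as UTC.
ms_ts = to_ns(datetime(2016, 10, 14, 21, 0, 44, 557000, tzinfo=timezone.utc))
sec_ts = to_ns(datetime(2016, 10, 14, 21, 0, 44, tzinfo=timezone.utc))

print(ms_ts)                                # 1476478844557000000
print(float64_round_trip(ms_ts))            # 1476478844556999936
print(ms_ts - float64_round_trip(ms_ts))    # 64
print(sec_ts - float64_round_trip(sec_ts))  # 0
```

The 64 ns loss matches the difference between 21:00:44.557 and 21:00:44.556999936 in Out[4], and the zero loss for the whole-second value matches Out[8]. This shows only that a float64 round trip would produce these numbers, not that pandas actually performs one.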

Output of pd.show_versions()

INSTALLED VERSIONS                                                                     
------------------                                                                     
commit: None                                                                           
python: 3.6.3.final.0                                                                  
python-bits: 64                                                                        
OS: Linux                                                                              
OS-release: 4.13.0-32-generic                                                          
machine: x86_64                                                                        
processor: x86_64                                                                      
byteorder: little                                                                      
LC_ALL: None                                                                           
LANG: en_US.UTF-8                                                                      
LOCALE: en_US.UTF-8                                                                    

pandas: 0.22.0                                                                         
pytest: 3.2.1                                                                          
pip: 9.0.1                                                                             
setuptools: 38.2.4                                                                     
Cython: 0.26.1                                                                         
numpy: 1.14.0                                                                          
scipy: 0.19.1                                                                          
pyarrow: None                                                                          
xarray: None                                                                           
IPython: 6.1.0                                                                         
sphinx: 1.6.3                                                                          
patsy: 0.4.1                                                                           
dateutil: 2.6.1                                                                        
pytz: 2017.3                                                                           
blosc: None                                                                            
bottleneck: 1.2.1                                                                      
tables: 3.4.2                                                                          
numexpr: 2.6.2                                                                         
feather: None                                                                          
matplotlib: 2.1.0                                                                      
openpyxl: 2.4.8                                                                        
xlrd: 1.1.0                                                                            
xlwt: 1.3.0                                                                            
xlsxwriter: 1.0.2                                                                      
lxml: 4.1.0                                                                            
bs4: 4.6.0                                                                             
html5lib: 0.9999999                                                                    
sqlalchemy: 1.1.13                                                                     
pymysql: None                                                                          
psycopg2: None                                                                         
jinja2: 2.9.6                                                                          
s3fs: None                                                                             
fastparquet: None                                                                      
pandas_gbq: None                                                                       
pandas_datareader: None