Floating point accuracy problems in DatetimeIndex.round · Issue #14440 · pandas-dev/pandas
A small, complete example of the issue
There is a slight problem when using the rounding methods of DatetimeIndex (round, floor, ceil) to round to high frequencies, as illustrated by this example:
>>> pd.DatetimeIndex(['2016-10-17 12:00:00.0015']).round('ms')
DatetimeIndex(['2016-10-17 12:00:00.001999872'], dtype='datetime64[ns]', freq=None)
The problem is here in the TimelikeOps._round method:
result = (unit * rounder(values / float(unit))).astype('i8')
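rounder(values / float(unit)) returns an array of floats containing the required multiples of unit. Although these values look like ints, they are still float64, so multiplying them by unit is a floating point multiplication. Nanosecond timestamps of this size (about 1.5e18) are larger than 2**53, so the product cannot always be represented exactly and is rounded to the nearest representable float before the cast to 'i8'. Replacing it with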
result = (unit * rounder(values / float(unit)).astype('i8'))
should fix the problem, because the multiples are cast to integers before they are multiplied by unit. I'm willing to do a PR to fix it.
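
To make the difference concrete, here is a minimal sketch that reproduces the two expressions with plain NumPy, outside of pandas. It assumes that unit is the integer number of nanoseconds in one millisecond and that rounder is np.round; both names are stand-ins chosen for illustration, not a copy of the pandas internals.

```python
import numpy as np

unit = 1_000_000    # assumption: nanoseconds in one millisecond, as a plain int
rounder = np.round  # assumption: stand-in for the rounding function used by _round

# The timestamp from the example, as int64 nanoseconds since the epoch.
values = np.array(['2016-10-17T12:00:00.0015'], dtype='datetime64[ns]').view('i8')

# Current expression: int * float64 array is computed in float64, then cast.
buggy = (unit * rounder(values / float(unit))).astype('i8')

# Proposed expression: cast the multiples to int64 first, then multiply as integers.
fixed = unit * rounder(values / float(unit)).astype('i8')

print(values[0])  # input in nanoseconds
print(buggy[0])   # result of the float64 multiplication
print(fixed[0])   # result of the int64 multiplication
```

With the proposed expression the result is an exact multiple of unit, while the current expression can drift by up to a few hundred nanoseconds at this magnitude. Note that the division values / float(unit) is still done in floating point in both versions, so this sketch only covers the multiplication step discussed above.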
Output of pd.show_versions()
pandas: 0.19.0
nose: None
pip: 8.1.2
setuptools: 27.2.0
Cython: None
numpy: 1.11.1
scipy: None
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: 0.7.6.None
psycopg2: None
jinja2: 2.8
boto: None