Floating point accuracy problems in DatetimeIndex.round · Issue #14440 · pandas-dev/pandas

A small, complete example of the issue

There is a slight problem when using the rounding methods of DatetimeIndex (round, floor, ceil) to high frequencies as illustrated by this example:

pd.DatetimeIndex(['2016-10-17 12:00:00.0015']).round('ms')
DatetimeIndex(['2016-10-17 12:00:00.001999872'], dtype='datetime64[ns]', freq=None)

The problem is here in the TimelikeOps._round method:

 result = (unit * rounder(values / float(unit))).astype('i8')

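rounder(values / float(unit)) returns an array of floats holding the number of whole units in each value. Although these floats are integer-valued, the multiplication by unit is still carried out in float64, and the product can be off: nanosecond timestamps for present-day dates are around 1.5e18, well above 2**53, where float64 can no longer represent every integer (adjacent representable values are 256 ns apart at that magnitude). Replacing it with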

 result = (unit * rounder(values / float(unit)).astype('i8'))

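should fix the problem, because the float multiples are cast to int64 first and the multiplication by unit is then done in exact integer arithmetic. I'm willing to do a PR to fix it.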
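The difference between the two orderings can be reproduced with NumPy alone. This is a minimal sketch, not the pandas source: it assumes rounder is np.round (as it would be for round) and that unit is the number of nanoseconds in a millisecond; the names buggy and fixed are only for illustration.

```python
import numpy as np

# Nanoseconds since the epoch for the timestamp used in the report.
values = np.array(["2016-10-17T12:00:00.0015"], dtype="datetime64[ns]").astype("i8")
unit = 1_000_000  # assumed: nanoseconds per millisecond

# Number of whole units per value, as integer-valued floats.
multiples = np.round(values / float(unit))

# Current order: multiply in float64, then cast. The product is about 1.5e18,
# which float64 cannot hold exactly, so it is no longer a whole millisecond.
buggy = (unit * multiples).astype("i8")

# Proposed order: cast the multiples to int64, then multiply in integers.
fixed = unit * multiples.astype("i8")

print(buggy.astype("datetime64[ns]"))
print(fixed.astype("datetime64[ns]"))
```

With the current order the result should come out as 12:00:00.001999872, the value shown in the report, while the proposed order should give a whole millisecond.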

Output of pd.show_versions()

pandas: 0.19.0
nose: None
pip: 8.1.2
setuptools: 27.2.0
Cython: None
numpy: 1.11.1
scipy: None
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: 0.7.6.None
psycopg2: None
jinja2: 2.8
boto: None