BUG:Floating point accuracy with DatetimeIndex.round (#14440) by mroeschke · Pull Request #15568 · pandas-dev/pandas

@@ -175,6 +175,17 @@ def test_round(self):

tm.assertRaisesRegexp(ValueError, msg, rng.round, freq='M')

tm.assertRaisesRegexp(ValueError, msg, elt.round, freq='M')

# GH 14440

Can you also round to us/ns here (the result should equal the original)?

As an aside, you can now write parametrized tests, but they have to be moved to separate functions rather than class-based tests.
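
A parametrized version of the two `ms` checks in this diff might look roughly like the sketch below. The function name `test_round_ms` and the `tz` values are assumptions chosen for illustration; the input and expected timestamps are the ones used in the diff, and `pandas.testing` stands in for the `tm` alias.

```python
import pandas as pd
import pandas.testing as tm
import pytest


# Hypothetical module-level test; the tz values here are illustrative only.
@pytest.mark.parametrize("tz", [None, "UTC"])
@pytest.mark.parametrize("value, expected", [
    ("2016-10-17 12:00:00.0015", "2016-10-17 12:00:00.002000"),
    ("2016-10-17 12:00:00.00149", "2016-10-17 12:00:00.001000"),
])
def test_round_ms(tz, value, expected):
    # Round a one-element index to milliseconds and compare with the expectation.
    index = pd.DatetimeIndex([value], tz=tz)
    result = index.round("ms")
    tm.assert_index_equal(result, pd.DatetimeIndex([expected], tz=tz))
```

pytest generates one test case per combination of `tz` and `(value, expected)`, so a failure report names the exact input that broke.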

So the rounding is happening correctly for microseconds but not with nanoseconds:

In [7]: pd.DatetimeIndex(['2016-10-17 12:00:00.0015']).round('ns')
Out[7]: DatetimeIndex(['2016-10-17 12:00:00.001499904'], dtype='datetime64[ns]', freq=None)

The rounding methodology seems sound. I am unsure whether this is a limitation of the value going from int64 to float64 and back to int64, which is essentially what is happening:

(Pdb) np.round(np.array([1476705600001500000])/1.).astype('i8')
array([1476705600001499904])
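
To make the precision loss concrete, here is a small sketch that reproduces the round trip above and contrasts it with integer-only rounding. The helper `round_ns_int` is hypothetical and is not what this PR implements; it rounds ties upward, which differs from `np.round`.

```python
import numpy as np

# Nanosecond timestamp for 2016-10-17 12:00:00.0015 (value from the pdb session above).
ns = np.array([1476705600001500000], dtype="i8")

# float64 has a 53-bit significand, so between 2**60 and 2**61
# adjacent representable values are 256 apart.
roundtrip = np.round(ns / 1.0).astype("i8")
print(roundtrip[0])          # 1476705600001499904
print(ns[0] - roundtrip[0])  # 96


def round_ns_int(values, unit):
    """Hypothetical helper: round int64 nanoseconds to a multiple of `unit`
    using integer arithmetic only (ties round up, unlike np.round)."""
    quotient, remainder = np.divmod(values, unit)
    return (quotient + (2 * remainder >= unit)) * unit


print(round_ns_int(ns, 1)[0])          # 1476705600001500000 (unchanged)
print(round_ns_int(ns, 1_000_000)[0])  # 1476705600002000000
```

The original value is 96 ns above a multiple of 256, so the conversion to float64 snaps it down to the nearest representable value before any rounding takes place.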

Yes, this is just a loss of precision; I'm not sure much can be done.

We could potentially warn or raise, though.
Would that be useful?

Created a new issue: #15578

index = pd.DatetimeIndex(['2016-10-17 12:00:00.0015'], tz=tz)

result = index.round('ms')

expected = pd.DatetimeIndex(['2016-10-17 12:00:00.002000'], tz=tz)

tm.assert_index_equal(result, expected)

index = pd.DatetimeIndex(['2016-10-17 12:00:00.00149'], tz=tz)

result = index.round('ms')

expected = pd.DatetimeIndex(['2016-10-17 12:00:00.001000'], tz=tz)

tm.assert_index_equal(result, expected)

def test_repeat_range(self):

rng = date_range('1/1/2000', '1/1/2001')
