Improve to_datetime bounds checking by rebecca-palmer · Pull Request #50183 · pandas-dev/pandas
to_datetime(errors='raise') is supposed to raise an exception on out-of-bounds input. However, the bounds check rounds the input to an integer while the actual conversion does not, so the two can disagree: for input just outside the bounds it may instead return NaT (on x86) or output clipped to the bounds (on arm).
It also assumes that converting NaN to int gives NaT, which may not be true on unusual hardware. (This was how I originally noticed the problem, as Debian tries to build packages on a wide range of hardware.)
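To illustrate the two failure modes described above, here is a minimal pure-Python sketch of a float-to-int64 conversion that checks bounds on the float itself and handles NaN explicitly, rather than relying on what a hardware cast happens to return. This is not the code in this patch: the helper name `checked_float_to_int64` and the constant names are hypothetical, and the only pandas fact assumed is that NaT is stored as the minimum int64 value.

```python
import math

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
NAT = INT64_MIN  # pandas stores NaT as the minimum int64 value


def checked_float_to_int64(value: float) -> int:
    """Hypothetical helper: convert a float to an int64-range integer.

    NaN is mapped to the NaT sentinel explicitly instead of relying on
    the result of a float-to-int cast, which is platform-dependent.
    """
    if math.isnan(value):
        return NAT
    # -2.0**63 and 2.0**63 are exactly representable as floats, so these
    # comparisons involve no rounding. Both ends are excluded: 2**63 does
    # not fit in int64, and -2**63 is reserved for NaT.
    if not (-(2.0**63) < value < 2.0**63):
        raise OverflowError(f"{value!r} is outside the int64 range")
    return int(value)  # truncates toward zero


# INT64_MAX is not representable as a float: it rounds up to 2.0**63,
# which is why a bounds check done after rounding can be off by one step.
print(float(INT64_MAX) == 2.0**63)
```

The point of the design is that the comparison and the conversion see the same value, so an input cannot pass the check and then overflow during the cast.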
This patch fixes both issues. (In the Debian logs linked here, search for 'near-limits test'; the first run is unpatched 1.3.5 and the second run is patched 1.5.x.)
Its effect on speed is unclear: it is sometimes faster and sometimes slower, possibly because other load on the build machines varies.
- [this bug does not appear to be previously reported ] closes #xxxx (Replace xxxx with the GitHub issue number)
- Tests added and passed if fixing a bug or adding a new feature
- All code checks passed.
- [N/A ] Added type annotations to new arguments/methods/functions.
- Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.