BUG/REF: ArrowExtensionArray non-nanosecond units by lukemanley · Pull Request #53171 · pandas-dev/pandas
- Tests added and passed if fixing a bug or adding a new feature
- All code checks passed.
- Added type annotations to new arguments/methods/functions.
- Added an entry in the latest doc/source/whatsnew/v2.1.0.rst file if fixing a bug or adding a new feature.
Fixes a few bugs related to non-nanosecond units for pyarrow duration and timestamp types.
The underlying issue is upstream (apache/arrow#33321): pyarrow does not handle non-nanosecond units of pandas Timedelta and Timestamp correctly.
Example (1 second -> 0 seconds):
In [1]: import pandas as pd
In [2]: import pyarrow as pa
In [3]: td = pd.Timedelta(1, unit="s").as_unit("s")
In [4]: pa.scalar(td, type=pa.duration("s"))
Out[4]: <pyarrow.DurationScalar: datetime.timedelta(0)>
In [5]: pa.array([td], type=pa.duration("s"))
Out[5]:
<pyarrow.lib.DurationArray object at 0x10751d120>
[
0
]