BUG/REF: ArrowExtensionArray non-nanosecond units by lukemanley · Pull Request #53171 · pandas-dev/pandas

Fixes a few bugs related to non-nanosecond units for pyarrow duration and timestamp types.

The underlying issue is upstream (apache/arrow#33321): pyarrow does not correctly handle pandas Timedelta and Timestamp objects with non-nanosecond units.

Example (1 second -> 0 seconds):

In [1]: import pandas as pd

In [2]: import pyarrow as pa

In [3]: td = pd.Timedelta(1, unit="s").as_unit("s")

In [4]: pa.scalar(td, type=pa.duration("s"))
Out[4]: <pyarrow.DurationScalar: datetime.timedelta(0)>

In [5]: pa.array([td], type=pa.duration("s"))
Out[5]: 
<pyarrow.lib.DurationArray object at 0x10751d120>
[
  0
]
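
One way to sidestep this kind of unit mismatch is to convert the duration to a plain integer count in the target unit before handing it to pyarrow. The following is a minimal sketch of that idea, not the code in this PR: the helper name `duration_to_int` and the `_US_PER_UNIT` table are hypothetical, and it uses only the standard library. It accepts any `datetime.timedelta` (which `pd.Timedelta` subclasses) and ignores any nanosecond component.

```python
from datetime import timedelta

# Microseconds per unit, for the units that datetime.timedelta can represent exactly.
# (Hypothetical lookup table for this sketch.)
_US_PER_UNIT = {"s": 1_000_000, "ms": 1_000, "us": 1}


def duration_to_int(td: timedelta, unit: str) -> int:
    """Return ``td`` as an exact integer count of ``unit``.

    Raises ValueError if the conversion would drop a fractional part,
    rather than silently truncating (e.g. 1 second becoming 0).
    """
    us_per_unit = _US_PER_UNIT[unit]
    # Total microseconds, computed with integer arithmetic only.
    total_us = (td.days * 86_400 + td.seconds) * 1_000_000 + td.microseconds
    count, remainder = divmod(total_us, us_per_unit)
    if remainder:
        raise ValueError(f"{td!r} is not a whole number of {unit!r}")
    return count


print(duration_to_int(timedelta(seconds=1), "s"))          # 1
print(duration_to_int(timedelta(milliseconds=1500), "ms"))  # 1500
```

The resulting integer, together with an explicit type such as `pa.duration("s")`, leaves no unit for pyarrow to infer from the pandas object. Whether that matches the fix applied in this PR is an assumption; the sketch only illustrates the approach.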