PERF: ArrowExtensionArray.to_numpy(dtype=object) by lukemanley · Pull Request #51227 · pandas-dev/pandas

dumb question: why isn't self._data.to_numpy() good enough here?

I believe there is a desire to maintain the raw dtype (even if wrapped in an object array) and use pd.NA by default.

For instance, this converts int to float when nulls are present:

In [1]: import pandas as pd

In [2]: arr = pd.array([1, None], dtype="int64[pyarrow]")

In [3]: arr._data.to_numpy()
Out[3]: array([ 1., nan])
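
As a rough illustration of why the float upcast can matter beyond the dtype change, here is a minimal NumPy-only sketch. The value `2**53 + 1` is picked for illustration and is not taken from the PR; it is an int64 that float64 cannot represent exactly.

```python
import numpy as np

# Illustrative value (an assumption, not from the PR): the smallest positive
# integer that float64 cannot represent exactly.
big = 2**53 + 1

ints = np.array([big], dtype=np.int64)

# This mirrors what happens to the non-null values when an integer array
# with nulls is converted to float64 so that NaN can mark the missing slots.
floats = ints.astype(np.float64)

# Converting back to a Python int shows whether the value survived the trip.
round_tripped = int(floats[0])
lossless = round_tripped == big
```

Here `lossless` ends up `False`, which is one reason an object array holding the original integers can be preferable to a float array when nulls are present.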

I'm not sure it's required, but the current implementation mostly aligns with BaseMaskedArray:

In [4]: arr = pd.array([1, None], dtype="Int64")

In [5]: arr.to_numpy()
Out[5]: array([1, <NA>], dtype=object)

In [6]: arr = pd.array([1, None], dtype="int64[pyarrow]")

In [7]: arr.to_numpy()
Out[7]: array([1, <NA>], dtype=object)