PERF: ArrowExtensionArray.to_numpy(dtype=object) by lukemanley · Pull Request #51227 · pandas-dev/pandas
dumb question: why isn't self._data.to_numpy() good enough here?
I believe there is a desire to maintain the raw dtype (even if wrapped in an object array) and use pd.NA by default.
For instance, this converts int to float when nulls are present:
In [1]: import pandas as pd
In [2]: arr = pd.array([1, None], dtype="int64[pyarrow]")
In [3]: arr._data.to_numpy()
Out[3]: array([ 1., nan])
I'm not sure it's required, but the current implementation mostly aligns with BaseMaskedArray:
In [4]: arr = pd.array([1, None], dtype="Int64")
In [5]: arr.to_numpy()
Out[5]: array([1, <NA>], dtype=object)
In [6]: arr = pd.array([1, None], dtype="int64[pyarrow]")
In [7]: arr.to_numpy()
Out[7]: array([1, <NA>], dtype=object)
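One practical reason to keep the raw values in an object array, as a rough sketch: a float conversion can silently change large integers, while the object path keeps them exact and marks the missing entry with pd.NA. The snippet uses the masked "Int64" dtype so it runs without pyarrow, and it passes the dtype explicitly rather than relying on defaults. The value `big` and the variable names are illustrative assumptions, not taken from the PR.

```python
import numpy as np
import pandas as pd

# 2**53 + 1 is the smallest positive integer that float64 cannot represent exactly.
big = 2**53 + 1
arr = pd.array([big, None], dtype="Int64")

# Float path: the null becomes NaN, and the integer is rounded to a nearby float.
as_float = arr.to_numpy(dtype="float64", na_value=np.nan)

# Object path: the integer is kept as-is and the null stays pd.NA.
as_object = arr.to_numpy(dtype=object)

print(int(as_float[0]) == big)   # the float round-trip does not preserve the value
print(as_object[0] == big)       # the object array holds the exact integer
print(as_object[1] is pd.NA)     # the missing value is pd.NA, not NaN
```

The trade-off is that an object array is slower and larger than a float array, which is presumably why the speed of the dtype=object path matters in this PR.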