Avoid double copy in Series/Index.to_numpy · Issue #24345 · pandas-dev/pandas (original) (raw)

#24341 added the copy parameter to to_numpy().

In some cases, that will copy data twice, when the asarray(self._values, dtype=dtype) makes a copy, Series / Index doesn't know about it.

To avoid that, we could add a method to the ExtensionArray developer API

def _to_numpy(self, dtype=None copy=False): -> Union[ndarray, bool] # returns ndarray, did_copy

The second item in the return tuple indicates whether the returned ndarray is already a copy, or whether it's a zero-copy view on some other data.

Then, back in Series.to_numpy we do

arr, copied = self.array._to_numpy(dtype, copy) if copy and not copied: arr = arr.copy()

return arr

This is too much complexity to rush into 0.24