API: SparseArray.astype behaviour to always preserve sparseness · Issue #34457 · pandas-dev/pandas
Currently, the SparseArray.astype function will always convert the specified target dtype to a sparse dtype if it is not already one. For example, this gives:
In [64]: arr = pd.arrays.SparseArray([1, 0, 0, 2])
In [65]: arr
Out[65]:
[1, 0, 0, 2]
Fill: 0
IntIndex
Indices: array([0, 3], dtype=int32)
In [66]: arr.astype(float)
Out[66]:
[1.0, 0.0, 0.0, 2.0]
Fill: 0.0
IntIndex
Indices: array([0, 3], dtype=int32)
This ensures that a simple astype doesn't densify the sparse array (and you don't need to do astype(pd.SparseDtype(float, fill_value))).
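For reference, a minimal sketch of the explicit spelling mentioned above. The fill value of 0.0 and the variable names are assumptions chosen for illustration; passing a pd.SparseDtype makes the intent to stay sparse explicit rather than relying on the implicit conversion:

```python
import numpy as np
import pandas as pd

arr = pd.arrays.SparseArray([1, 0, 0, 2])

# Explicitly request a sparse float dtype; the fill value 0.0 is an
# assumption for this example (it matches the integer fill value 0 above).
sparse_float = pd.SparseDtype(float, 0.0)
explicit = arr.astype(sparse_float)

# The result is still sparse, and its dtype is exactly what was requested.
print(explicit.dtype)
print(explicit.dtype == sparse_float)

# Only the non-fill values are stored.
print(explicit.sp_values)
```

With the explicit form, the returned dtype equals the requested dtype, so there is no ambiguity about what comes back.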
And note that this also gives this behaviour to Series.astype(..).
But this also gives the inconsistency that arr.astype(target_dtype).dtype != target_dtype, so you can't rely on the fact that you get back an array of the actual dtype that you specified.
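A small sketch of how this inconsistency can be observed. The names stayed_sparse and matches_target are hypothetical, introduced only for this example, and the check is written so that it does not assume a particular outcome, since what astype returns for a non-sparse target may depend on the installed pandas version:

```python
import numpy as np
import pandas as pd

arr = pd.arrays.SparseArray([1, 0, 0, 2])
target_dtype = np.dtype("float64")

result = arr.astype(target_dtype)

# Either the result stays sparse (and its dtype differs from the target),
# or it has exactly the requested dtype (and is no longer sparse).
stayed_sparse = isinstance(result.dtype, pd.SparseDtype)
matches_target = result.dtype == target_dtype

print("result dtype:  ", result.dtype)
print("stayed sparse: ", stayed_sparse)
print("matches target:", matches_target)

# The same question applies when going through Series.astype.
series_result = pd.Series(arr).astype(target_dtype)
print("Series dtype:  ", series_result.dtype)
```

Code that needs a guaranteed result dtype currently has to perform a check like this (or pass a pd.SparseDtype explicitly) instead of trusting the requested dtype.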
See e.g. the workaround I need to add for this in #34338.