pandas.arrays.SparseArray — pandas 2.2.3 documentation
class pandas.arrays.SparseArray(data, sparse_index=None, fill_value=None, kind='integer', dtype=None, copy=False)
An ExtensionArray for storing sparse data.
Parameters:
data : array-like or scalar
A dense array of values to store in the SparseArray. This may contain fill_value.
sparse_index : SparseIndex, optional
fill_value : scalar, optional
Elements in data that are fill_value are not stored in the SparseArray. For memory savings, this should be the most common value in data. By default, fill_value depends on the dtype of data:
data.dtype | na_value |
---|---|
float | np.nan |
int | 0 |
bool | False |
datetime64 | pd.NaT |
timedelta64 | pd.NaT |
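As a quick check of the defaults in the table above, the following sketch builds arrays from float, integer and boolean data without passing fill_value and inspects what ends up stored. The variable names are illustrative only, and the behavior described in the comments is what pandas 2.2 is expected to produce.

```python
import numpy as np
from pandas.arrays import SparseArray

# No fill_value is passed, so the default is derived from the dtype of the data.
float_arr = SparseArray(np.array([1.0, np.nan, np.nan]))
int_arr = SparseArray(np.array([0, 0, 3]))
bool_arr = SparseArray(np.array([False, True, False]))

for arr in (float_arr, int_arr, bool_arr):
    # sp_values holds only the elements that differ from fill_value.
    print(arr.dtype, "fill_value =", arr.fill_value, "stored =", arr.sp_values)
```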
The fill value is potentially specified in three ways. In order of precedence, these are
- The fill_value argument
- dtype.fill_value if fill_value is None and dtype is a SparseDtype
- data.dtype.fill_value if fill_value is None and dtype is not a SparseDtype and data is a SparseArray.
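The precedence rules can be sketched as follows. This is a minimal illustration, not an exhaustive test: the names explicit, from_dtype and from_data are made up for the example, and each rule is shown on input where the outcome is unambiguous.

```python
import pandas as pd
from pandas.arrays import SparseArray

# Rule 1: an explicit fill_value wins over the one carried by a SparseDtype.
explicit = SparseArray(
    [0, 1, 1, 2], fill_value=1, dtype=pd.SparseDtype("int64", 0)
)

# Rule 2: with fill_value=None, the SparseDtype supplies the fill value.
from_dtype = SparseArray([0, 1, 1, 2], dtype=pd.SparseDtype("int64", 1))

# Rule 3: with neither given, an existing SparseArray passes on its own fill value.
from_data = SparseArray(from_dtype)

print(explicit.fill_value, from_dtype.fill_value, from_data.fill_value)
```

In all three cases the fill value should be 1, so only the values 0 and 2 are stored.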
kind : str
Can be ‘integer’ or ‘block’, default is ‘integer’. The type of storage for sparse locations.
- ‘block’: Stores a block and block_length for each contiguous span of sparse values. This is best when sparse data tends to be clumped together, with large regions of fill-value values between sparse values.
- ‘integer’: Uses an integer to store the location of each sparse value.
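To make the difference between the two storage kinds concrete, the sketch below stores the same clumped data both ways and looks at the resulting index objects. The sample data is invented for illustration, and the blocs, blengths and indices attributes belong to pandas' internal index classes, so treat them as inspection aids rather than a stable interface.

```python
import numpy as np
from pandas.arrays import SparseArray

# Two clumps of non-fill values (fill value defaults to 0 for int data).
data = [0, 0, 1, 1, 1, 0, 0, 2, 2, 0]

block = SparseArray(data, kind="block")
integer = SparseArray(data, kind="integer")

# 'block' records a start position and a length per contiguous run.
print(block.sp_index.blocs, block.sp_index.blengths)

# 'integer' records one position per stored value.
print(integer.sp_index.indices)

# Both layouts describe the same dense data.
same_dense = np.array_equal(np.asarray(block), np.asarray(integer))
```

Here the block layout needs two (start, length) pairs, while the integer layout needs five positions.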
dtype : np.dtype or SparseDtype, optional
The dtype to use for the SparseArray. For numpy dtypes, this determines the dtype of self.sp_values. For SparseDtype, this determines self.sp_values and self.fill_value.
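A short sketch of the two ways to pass dtype, assuming integer input that should be stored as float64. The fill value is given explicitly in the numpy case so that the example does not rely on any default; the variable names are illustrative.

```python
import numpy as np
import pandas as pd
from pandas.arrays import SparseArray

# A numpy dtype only controls the dtype of the stored values (sp_values).
with_numpy = SparseArray([0, 0, 1, 2], fill_value=0, dtype=np.float64)

# A SparseDtype controls both the stored dtype and the fill value.
with_sparse = SparseArray([0, 0, 1, 2], dtype=pd.SparseDtype("float64", 0.0))

print(with_numpy.sp_values.dtype, with_numpy.fill_value)
print(with_sparse.sp_values.dtype, with_sparse.fill_value)
```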
copy : bool, default False
Whether to explicitly copy the incoming data array.
Attributes
Methods
Examples
>>> from pandas.arrays import SparseArray
>>> arr = SparseArray([0, 0, 1, 2])
>>> arr
[0, 0, 1, 2]
Fill: 0
IntIndex
Indices: array([2, 3], dtype=int32)
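Building on the example above, this sketch shows how the stored values relate to the dense data. It uses the same input and only reads attributes of the resulting array; the variable names stored, dense and density are chosen for the example.

```python
import numpy as np
from pandas.arrays import SparseArray

arr = SparseArray([0, 0, 1, 2])

stored = arr.sp_values   # only the values that differ from the fill value
dense = np.asarray(arr)  # dense reconstruction, fill values included
density = arr.density    # fraction of elements that are actually stored

print(stored, dense, density)
```

With half of the elements equal to the fill value, two of the four values are stored.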