pandas.api.extensions.ExtensionArray._values_for_factorize - pandas 2.2.3 documentation

ExtensionArray._values_for_factorize()

Return an array and missing value suitable for factorization.

Returns:

values : ndarray

An array suitable for factorization. This should maintain order and be a supported dtype (Float64, Int64, UInt64, String, Object). By default, the extension array is cast to object dtype.

na_value : object

The value in values to consider missing. This will be treated as NA in the factorization routines, so it will be coded as -1 and not included in uniques. By default, np.nan is used.
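The effect of the returned pair on factorization can be illustrated with the public pd.factorize() function. This is a minimal sketch, not the internal code path: it assumes a hand-built object-dtype array with np.nan as the missing marker, which mirrors the documented default shape of (values, na_value). The sample data is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the default return value:
# an object-dtype ndarray plus np.nan as the missing marker (assumed data).
values = np.array(["b", np.nan, "a", "b"], dtype=object)
na_value = np.nan

# Factorize the values; entries equal to the missing marker get code -1.
codes, uniques = pd.factorize(values)

codes_list = codes.tolist()      # one integer code per input element
uniques_list = list(uniques)     # distinct non-missing values, in order of appearance
print(codes_list, uniques_list)
```

The missing entry is coded as -1 and does not appear among the uniques, while the remaining values are numbered in order of first appearance.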

Notes

The values returned by this method are also used in pandas.util.hash_pandas_object(). If needed, this can be overridden in the self._hash_pandas_object() method.
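The connection to hashing can be seen from the public side with pandas.util.hash_pandas_object(). The following is a small sketch that assumes a Series backed by a nullable Int64 extension array with invented sample values; it shows only the observable behaviour (equal elements hash to equal values), not how a particular array type computes its hashes internally.

```python
import pandas as pd

# A Series backed by an extension array (nullable Int64); sample data is assumed.
ser = pd.Series(pd.array([1, 2, 1], dtype="Int64"))

# Hash the values only (index=False leaves the index out of the hash).
hashes = pd.util.hash_pandas_object(ser, index=False)

same = hashes.iloc[0] == hashes.iloc[2]       # identical elements
different = hashes.iloc[0] != hashes.iloc[1]  # distinct elements
print(hashes.dtype, same, different)
```

The result is a uint64 Series with one hash per element, so an array type whose factorization values are unsuitable for hashing is the case where overriding _hash_pandas_object() becomes relevant.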

Examples

>>> pd.array([1, 2, 3])._values_for_factorize()
(array([1, 2, 3], dtype=object), nan)