pandas.Index.factorize (pandas 0.24.0rc1 documentation)

Index.factorize(sort=False, na_sentinel=-1)

Encode the object as an enumerated type or categorical variable.

This method is useful for obtaining a numeric representation of an array when all that matters is identifying distinct values. factorize is available both as a top-level function, pandas.factorize(), and as the methods Series.factorize() and Index.factorize().
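
As a minimal sketch of the three entry points named above (the sample values and variable names are illustrative assumptions, not taken from the pandas documentation), the following factorizes a small Index and compares the result with the top-level function and the Series method:

```python
import pandas as pd

# Illustrative data: three distinct values, two of them repeated.
idx = pd.Index(["b", "a", "c", "b", "a"])

# Method form: integer labels plus the distinct values in order of first appearance.
labels, uniques = idx.factorize()
print(labels.tolist())   # [0, 1, 2, 0, 1]
print(list(uniques))     # ['b', 'a', 'c']

# The top-level function and the Series method encode the same data the same way.
top_labels, top_uniques = pd.factorize(idx)
series_labels, series_uniques = pd.Series(idx).factorize()
```

With sort left at its default of False, the labels follow the order in which each distinct value first appears.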

Parameters:
    sort : boolean, default False
        Sort uniques and shuffle labels to maintain the relationship.
    na_sentinel : int, default -1
        Value to mark “not found”.

Returns:
    labels : ndarray
        An integer ndarray that’s an indexer into uniques. uniques.take(labels) will have the same values as values.
    uniques : ndarray, Index, or Categorical
        The unique valid values. When values is Categorical, uniques is a Categorical. When values is some other pandas object, an Index is returned. Otherwise, a 1-D ndarray is returned.

        Note: Even if there’s a missing value in values, uniques will not contain an entry for it.
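
The return contract described above can be sketched as follows. The sample Index and the variable names are assumptions made for illustration, and the sketch leaves na_sentinel at its default of -1 instead of passing the keyword explicitly.

```python
import pandas as pd

# Illustrative Index with one missing value (None).
idx = pd.Index(["b", None, "a", "b"])

# Default: uniques keep first-appearance order; the missing value gets the sentinel -1.
labels, uniques = idx.factorize()
print(labels.tolist())   # [0, -1, 1, 0]
print(list(uniques))     # ['b', 'a']  (no entry for the missing value)

# sort=True orders uniques and remaps labels so the relationship is preserved.
sorted_labels, sorted_uniques = idx.factorize(sort=True)
print(sorted_labels.tolist())   # [1, -1, 0, 1]
print(list(sorted_uniques))     # ['a', 'b']

# For the non-missing positions, uniques.take(labels) reproduces the original values.
valid = labels != -1
roundtrip = uniques.take(labels[valid])
print(list(roundtrip))   # ['b', 'a', 'b']
```

The sentinel positions are masked out before calling take, because -1 would otherwise be read as "last element" rather than "missing".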
