pandas.Categorical — pandas 0.24.0rc1 documentation (original) (raw)
class pandas.
Categorical
(values, categories=None, ordered=None, dtype=None, fastpath=False)[source]¶
Represents a categorical variable in classic R / S-plus fashion
Categoricals can only take on only a limited, and usually fixed, number of possible values (categories). In contrast to statistical categorical variables, a Categorical might have an order, but numerical operations (additions, divisions, …) are not possible.
All values of the Categorical are either in categories or np.nan. Assigning values outside of categories will raise a ValueError. Order is defined by the order of the categories, not lexical order of the values.
Parameters: | values : list-like The values of the categorical. If categories are given, values not in categories will be replaced with NaN. categories : Index-like (unique), optional The unique categories for this categorical. If not given, the categories are assumed to be the unique values of values (sorted, if possible, otherwise in the order in which they appear). ordered : boolean, (default False) Whether or not this categorical is treated as a ordered categorical. If True, the resulting categorical will be ordered. An ordered categorical respects, when sorted, the order of itscategories attribute (which in turn is the categories argument, if provided). dtype : CategoricalDtype An instance of CategoricalDtype to use for this categorical New in version 0.21.0. |
---|---|
Raises: | ValueError If the categories do not validate. TypeError If an explicit ordered=True is given but no categories and thevalues are not sortable. |
See also
pandas.api.types.CategoricalDtype
Type for categorical data.
An Index with an underlying Categorical
.
Notes
See the user guide for more.
Examples
pd.Categorical([1, 2, 3, 1, 2, 3]) [1, 2, 3, 1, 2, 3] Categories (3, int64): [1, 2, 3]
pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c']) [a, b, c, a, b, c] Categories (3, object): [a, b, c]
Ordered Categoricals can be sorted according to the custom order of the categories and can have a min and max value.
c = pd.Categorical(['a','b','c','a','b','c'], ordered=True, ... categories=['c', 'b', 'a']) c [a, b, c, a, b, c] Categories (3, object): [c < b < a] c.min() 'c'
Attributes
categories | The categories of this categorical. |
---|---|
codes | The category codes of this categorical. |
ordered | Whether the categories have an ordered relationship. |
dtype | The CategoricalDtype for this instance |
Methods
from_codes(codes[, categories, ordered, dtype]) | Make a Categorical type from codes and categories or dtype. |
---|---|
__array__([dtype]) | The numpy array interface. |