pandas.api.types.union_categoricals — pandas 2.2.3 documentation (original) (raw)
pandas.api.types.union_categoricals(to_union, sort_categories=False, ignore_order=False)[source]#
Combine list-like of Categorical-like, unioning categories.
All categories must have the same dtype.
Parameters:
to_unionlist-like
Categorical, CategoricalIndex, or Series with dtype=’category’.
sort_categoriesbool, default False
If true, resulting categories will be lexsorted, otherwise they will be ordered as they appear in the data.
ignore_orderbool, default False
If true, the ordered attribute of the Categoricals will be ignored. Results in an unordered categorical.
Returns:
Categorical
Raises:
TypeError
- all inputs do not have the same dtype
- all inputs do not have the same ordered property
- all inputs are ordered and their categories are not identical
- sort_categories=True and Categoricals are ordered
ValueError
Empty list of categoricals passed
Notes
To learn more about categories, see link
Examples
If you want to combine categoricals that do not necessarily have the same categories, union_categoricals will combine a list-like of categoricals. The new categories will be the union of the categories being combined.
a = pd.Categorical(["b", "c"]) b = pd.Categorical(["a", "b"]) pd.api.types.union_categoricals([a, b]) ['b', 'c', 'a', 'b'] Categories (3, object): ['b', 'c', 'a']
By default, the resulting categories will be ordered as they appear in the categories of the data. If you want the categories to be lexsorted, use sort_categories=True argument.
pd.api.types.union_categoricals([a, b], sort_categories=True) ['b', 'c', 'a', 'b'] Categories (3, object): ['a', 'b', 'c']
union_categoricals also works with the case of combining two categoricals of the same categories and order information (e.g. what you could also append for).
a = pd.Categorical(["a", "b"], ordered=True) b = pd.Categorical(["a", "b", "a"], ordered=True) pd.api.types.union_categoricals([a, b]) ['a', 'b', 'a', 'b', 'a'] Categories (2, object): ['a' < 'b']
Raises TypeError because the categories are ordered and not identical.
a = pd.Categorical(["a", "b"], ordered=True) b = pd.Categorical(["a", "b", "c"], ordered=True) pd.api.types.union_categoricals([a, b]) Traceback (most recent call last): ... TypeError: to union ordered Categoricals, all categories must be the same
Ordered categoricals with different categories or orderings can be combined by using the ignore_ordered=True argument.
a = pd.Categorical(["a", "b", "c"], ordered=True) b = pd.Categorical(["c", "b", "a"], ordered=True) pd.api.types.union_categoricals([a, b], ignore_order=True) ['a', 'b', 'c', 'c', 'b', 'a'] Categories (3, object): ['a', 'b', 'c']
union_categoricals also works with a CategoricalIndex, or Seriescontaining categorical data, but note that the resulting array will always be a plain Categorical
a = pd.Series(["b", "c"], dtype='category') b = pd.Series(["a", "b"], dtype='category') pd.api.types.union_categoricals([a, b]) ['b', 'c', 'a', 'b'] Categories (3, object): ['b', 'c', 'a']