pandas.from_dummies — pandas 2.2.3 documentation

pandas.from_dummies(data, sep=None, default_category=None)

Create a categorical DataFrame from a DataFrame of dummy variables.

Inverts the operation performed by get_dummies().
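The round trip with get_dummies() can be sketched as follows. The `colors` frame and its values are hypothetical illustration data, and the sketch assumes the same separator is used for encoding (`prefix_sep`) and decoding (`sep`).

```python
import pandas as pd

# Hypothetical example data, not taken from the pandas documentation.
colors = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Encode: one indicator column per category, named "color_<category>".
dummies = pd.get_dummies(colors, prefix_sep="_")

# Decode: split each column name on "_" to recover the variable name
# and the category it stands for.
decoded = pd.from_dummies(dummies, sep="_")

print(decoded)
```

The decoded frame has a single column, `color`, holding the original labels in their original row order.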

Added in version 1.5.0.

Parameters:

data : DataFrame

Data containing dummy-coded variables in the form of integer columns of 1's and 0's.

sep : str, default None

Separator used in the column names of the dummy categories: the character that separates each category name from its prefix. For example, if your column names are 'prefix_A' and 'prefix_B', you can strip the underscore by specifying sep='_'.

default_category : None, Hashable or dict of Hashables, default None

The default category is the implied category when a value has none of the listed categories marked with a one, i.e. when all dummies in a row are zero. Can be a single value applied to all variables, or a dict mapping each variable's prefix to its default category.
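As a minimal sketch of the single-value form, the snippet below passes one hashable as default_category so that every variable falls back to the same label. The frame and the label "other" are hypothetical.

```python
import pandas as pd

# Hypothetical dummy data: the last row has no category set for either variable.
dummies = pd.DataFrame({
    "col1_a": [1, 0, 0], "col1_b": [0, 1, 0],
    "col2_a": [0, 1, 0], "col2_b": [1, 0, 0],
})

# A single value is used as the implied category for every prefix.
decoded = pd.from_dummies(dummies, sep="_", default_category="other")

print(decoded)
```

Use the dict form instead when each variable needs its own fallback label.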

Returns:

DataFrame

Categorical data decoded from the dummy input-data.

Raises:

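ValueError

When the input DataFrame data contains NA values.
When the input DataFrame data contains column names with separators that do not match the separator specified with sep.
When a dict passed to default_category does not include an implied category for each prefix.
When a value in data has more than one category assigned to it.
When default_category=None and a value in data has no category assigned to it.

TypeError

When the input data is not of type DataFrame.
When the input DataFrame data contains non-dummy data.
When the passed sep is of a wrong data type.
When the passed default_category is of a wrong data type.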
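The sketch below probes three inputs that are expected to be rejected: a row with no category and no default_category, a row with two categories set, and a non-DataFrame argument. `try_decode` is a hypothetical helper written for this illustration, and the behaviour shown is assumed to match pandas 2.2.

```python
import pandas as pd


def try_decode(data, **kwargs):
    """Return the name of the exception from_dummies raises, or None on success."""
    try:
        pd.from_dummies(data, **kwargs)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return None


# Second row is all zeros, so it has no category assigned.
unassigned = pd.DataFrame({"a": [1, 0], "b": [0, 0]})
# Second row has two ones, so it has more than one category assigned.
multi = pd.DataFrame({"a": [1, 1], "b": [0, 1]})
# A Series is not a DataFrame.
not_a_frame = pd.Series([1, 0, 1], name="a")

print(try_decode(unassigned))
print(try_decode(unassigned, default_category="none"))
print(try_decode(multi))
print(try_decode(not_a_frame))
```

Supplying a default_category resolves the all-zero case, while a row with several ones has no such escape hatch and has to be cleaned before decoding.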

See also

get_dummies()

Convert Series or DataFrame to dummy codes.

Categorical

Represent a categorical variable in classic R / S-plus fashion.

Notes

The columns of the passed dummy data should only include 1’s and 0’s, or boolean values.
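To illustrate the note, the following sketch decodes the same hypothetical indicators once as booleans and once as integers and compares the results.

```python
import pandas as pd

# Hypothetical boolean indicators: exactly one True per row.
bool_dummies = pd.DataFrame({
    "a": [True, False, False],
    "b": [False, True, False],
    "c": [False, False, True],
})

# The same indicators expressed as 1's and 0's.
int_dummies = bool_dummies.astype(int)

from_bool = pd.from_dummies(bool_dummies)
from_int = pd.from_dummies(int_dummies)

# Both inputs decode to the same single-column frame.
print(from_bool.equals(from_int))
```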

Examples

>>> df = pd.DataFrame({"a": [1, 0, 0, 1], "b": [0, 1, 0, 0],
...                    "c": [0, 0, 1, 0]})

>>> df
   a  b  c
0  1  0  0
1  0  1  0
2  0  0  1
3  1  0  0

>>> pd.from_dummies(df)
0  a
1  b
2  c
3  a

>>> df = pd.DataFrame({"col1_a": [1, 0, 1], "col1_b": [0, 1, 0],
...                    "col2_a": [0, 1, 0], "col2_b": [1, 0, 0],
...                    "col2_c": [0, 0, 1]})

>>> df
   col1_a  col1_b  col2_a  col2_b  col2_c
0       1       0       0       1       0
1       0       1       1       0       0
2       1       0       0       0       1

>>> pd.from_dummies(df, sep="_")
  col1 col2
0    a    b
1    b    a
2    a    c

>>> df = pd.DataFrame({"col1_a": [1, 0, 0], "col1_b": [0, 1, 0],
...                    "col2_a": [0, 1, 0], "col2_b": [1, 0, 0],
...                    "col2_c": [0, 0, 0]})

>>> df
   col1_a  col1_b  col2_a  col2_b  col2_c
0       1       0       0       1       0
1       0       1       1       0       0
2       0       0       0       0       0

>>> pd.from_dummies(df, sep="_", default_category={"col1": "d", "col2": "e"})
  col1 col2
0    a    b
1    b    a
2    d    e