pandas.core.groupby.SeriesGroupBy.transform — pandas 2.2.3 documentation (original) (raw)

SeriesGroupBy.transform(func, *args, engine=None, engine_kwargs=None, **kwargs)[source]#

Call function producing a same-indexed Series on each group.

Returns a Series having the same indexes as the original object filled with the transformed values.

Parameters:

ffunction, str

Function to apply to each group. See the Notes section below for requirements.

Accepted inputs are:

Only passing a single function is supported with this engine. If the 'numba' engine is chosen, the function must be a user defined function with values and index as the first and second arguments respectively in the function signature. Each group’s index will be passed to the user defined function and optionally available for use.

If a string is chosen, then it needs to be the name of the groupby method you want to use.

*args

Positional arguments to pass to func.

enginestr, default None

engine_kwargsdict, default None

**kwargs

Keyword arguments to be passed into func.

Returns:

Series

See also

Series.groupby.apply

Apply function func group-wise and combine the results together.

Series.groupby.aggregate

Aggregate using one or more operations over the specified axis.

Series.transform

Call func on self producing a Series with the same axis shape as self.

Notes

Each group is endowed the attribute ‘name’ in case you need to know which group you are working on.

The current implementation imposes three requirements on f:

When using engine='numba', there will be no “fall back” behavior internally. The group data and group index will be passed as numpy arrays to the JITed user defined function, and no alternative execution attempts will be tried.

Changed in version 1.3.0: The resulting dtype will reflect the return value of the passed func, see the examples below.

Changed in version 2.0.0: When using .transform on a grouped DataFrame and the transformation function returns a DataFrame, pandas now aligns the result’s index with the input’s index. You can call .to_numpy() on the result of the transformation function to avoid alignment.

Examples

ser = pd.Series([390.0, 350.0, 30.0, 20.0], ... index=["Falcon", "Falcon", "Parrot", "Parrot"], ... name="Max Speed") grouped = ser.groupby([1, 1, 2, 2]) grouped.transform(lambda x: (x - x.mean()) / x.std()) Falcon 0.707107 Falcon -0.707107 Parrot 0.707107 Parrot -0.707107 Name: Max Speed, dtype: float64

Broadcast result of the transformation

grouped.transform(lambda x: x.max() - x.min()) Falcon 40.0 Falcon 40.0 Parrot 10.0 Parrot 10.0 Name: Max Speed, dtype: float64

grouped.transform("mean") Falcon 370.0 Falcon 370.0 Parrot 25.0 Parrot 25.0 Name: Max Speed, dtype: float64

Changed in version 1.3.0.

The resulting dtype will reflect the return value of the passed func, for example:

grouped.transform(lambda x: x.astype(int).max()) Falcon 390 Falcon 390 Parrot 30 Parrot 30 Name: Max Speed, dtype: int64