GroupBy — pandas 2.2.3 documentation

pandas.api.typing.DataFrameGroupBy and pandas.api.typing.SeriesGroupBy instances are returned by the groupby calls pandas.DataFrame.groupby() and pandas.Series.groupby(), respectively.

Indexing, iteration

DataFrameGroupBy.__iter__() Groupby iterator.
SeriesGroupBy.__iter__() Groupby iterator.
DataFrameGroupBy.groups Dict {group name -> group labels}.
SeriesGroupBy.groups Dict {group name -> group labels}.
DataFrameGroupBy.indices Dict {group name -> group indices}.
SeriesGroupBy.indices Dict {group name -> group indices}.
DataFrameGroupBy.get_group(name[, obj]) Construct DataFrame from group with provided name.
SeriesGroupBy.get_group(name[, obj]) Construct Series from group with provided name.
Grouper(*args, **kwargs) A Grouper allows the user to specify a groupby instruction for an object.
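
The entries above can be illustrated with a small sketch. The data, the column names ("team", "score") and the row labels are assumptions chosen for the example, not part of the reference.

```python
import pandas as pd

# Hypothetical data with string row labels so labels and positions differ.
df = pd.DataFrame(
    {"team": ["x", "y", "x", "y"], "score": [10, 20, 30, 40]},
    index=["r0", "r1", "r2", "r3"],
)
gb = df.groupby("team")

# Iterating yields (group name, sub-DataFrame) pairs.
sizes = {name: len(group) for name, group in gb}

# .groups maps each group name to the index labels of its rows.
labels_x = list(gb.groups["x"])

# .indices maps each group name to integer row positions instead.
positions_x = gb.indices["x"].tolist()

# get_group returns the rows belonging to a single group.
group_y = gb.get_group("y")

# pd.Grouper describes the grouping instruction as an object, here by column key.
totals = df.groupby(pd.Grouper(key="team"))["score"].sum()
```

Here groups and indices differ only because the index is not the default integer range: the first holds labels ("r0", "r2"), the second holds positions (0, 2).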

Function application helper

NamedAgg(column, aggfunc) Helper for column specific aggregation with control over output column names.
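
A short sketch of how NamedAgg is typically passed to agg: each keyword becomes an output column name, and the NamedAgg value says which input column to aggregate and how. The data and the output names (min_height, mean_weight) are hypothetical.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame(
    {
        "kind": ["cat", "dog", "cat"],
        "height": [9.0, 6.0, 11.0],
        "weight": [8.0, 7.5, 10.0],
    }
)

# Keyword name -> output column; NamedAgg -> (input column, aggregation).
summary = df.groupby("kind").agg(
    min_height=pd.NamedAgg(column="height", aggfunc="min"),
    mean_weight=pd.NamedAgg(column="weight", aggfunc="mean"),
)
```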

Function application

SeriesGroupBy.apply(func, *args, **kwargs) Apply function func group-wise and combine the results together.
DataFrameGroupBy.apply(func, *args[, ...]) Apply function func group-wise and combine the results together.
SeriesGroupBy.agg([func, engine, engine_kwargs]) Aggregate using one or more operations over the specified axis.
DataFrameGroupBy.agg([func, engine, ...]) Aggregate using one or more operations over the specified axis.
SeriesGroupBy.aggregate([func, engine, ...]) Aggregate using one or more operations over the specified axis.
DataFrameGroupBy.aggregate([func, engine, ...]) Aggregate using one or more operations over the specified axis.
SeriesGroupBy.transform(func, *args[, ...]) Call function producing a same-indexed Series on each group.
DataFrameGroupBy.transform(func, *args[, ...]) Call function producing a same-indexed DataFrame on each group.
SeriesGroupBy.pipe(func, *args, **kwargs) Apply a func with arguments to this GroupBy object and return its result.
DataFrameGroupBy.pipe(func, *args, **kwargs) Apply a func with arguments to this GroupBy object and return its result.
DataFrameGroupBy.filter(func[, dropna]) Filter elements from groups that don't satisfy a criterion.
SeriesGroupBy.filter(func[, dropna]) Filter elements from groups that don't satisfy a criterion.
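
The methods in this table differ mainly in the shape of what they return. The sketch below, on made-up data with assumed column names "g" and "v", contrasts them: agg reduces to one row per group, transform keeps the original index, filter drops whole groups, and pipe hands the groupby object itself to a function.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"g": ["a", "a", "b"], "v": [1, 3, 10]})
gb = df.groupby("g")["v"]  # a SeriesGroupBy

# agg: one row per group, one column per requested operation.
agg_result = gb.agg(["sum", "mean"])

# apply: call a function on each group and combine the results.
top = gb.apply(lambda s: s.max())

# transform: the result is indexed like the original data.
centered = gb.transform(lambda s: s - s.mean())

# filter: keep only rows of groups that satisfy the criterion (group "b" is dropped).
kept = df.groupby("g").filter(lambda part: len(part) > 1)

# pipe: the function receives the whole groupby object, not individual groups.
spread = gb.pipe(lambda grouped: grouped.max() - grouped.min())
```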

DataFrameGroupBy computations / descriptive stats

DataFrameGroupBy.all([skipna]) Return True if all values in the group are truthful, else False.
DataFrameGroupBy.any([skipna]) Return True if any value in the group is truthful, else False.
DataFrameGroupBy.bfill([limit]) Backward fill the values.
DataFrameGroupBy.corr([method, min_periods, ...]) Compute pairwise correlation of columns, excluding NA/null values.
DataFrameGroupBy.corrwith(other[, axis, ...]) Compute pairwise correlation.
DataFrameGroupBy.count() Compute count of group, excluding missing values.
DataFrameGroupBy.cov([min_periods, ddof, ...]) Compute pairwise covariance of columns, excluding NA/null values.
DataFrameGroupBy.cumcount([ascending]) Number each item in each group from 0 to the length of that group - 1.
DataFrameGroupBy.cummax([axis, numeric_only]) Cumulative max for each group.
DataFrameGroupBy.cummin([axis, numeric_only]) Cumulative min for each group.
DataFrameGroupBy.cumprod([axis]) Cumulative product for each group.
DataFrameGroupBy.cumsum([axis]) Cumulative sum for each group.
DataFrameGroupBy.describe([percentiles, ...]) Generate descriptive statistics.
DataFrameGroupBy.diff([periods, axis]) First discrete difference of element.
DataFrameGroupBy.ffill([limit]) Forward fill the values.
DataFrameGroupBy.fillna([value, method, ...]) (DEPRECATED) Fill NA/NaN values using the specified method within groups.
DataFrameGroupBy.first([numeric_only, ...]) Compute the first entry of each column within each group.
DataFrameGroupBy.head([n]) Return first n rows of each group.
DataFrameGroupBy.idxmax([axis, skipna, ...]) Return index of first occurrence of maximum over requested axis.
DataFrameGroupBy.idxmin([axis, skipna, ...]) Return index of first occurrence of minimum over requested axis.
DataFrameGroupBy.last([numeric_only, ...]) Compute the last entry of each column within each group.
DataFrameGroupBy.max([numeric_only, ...]) Compute max of group values.
DataFrameGroupBy.mean([numeric_only, ...]) Compute mean of groups, excluding missing values.
DataFrameGroupBy.median([numeric_only]) Compute median of groups, excluding missing values.
DataFrameGroupBy.min([numeric_only, ...]) Compute min of group values.
DataFrameGroupBy.ngroup([ascending]) Number each group from 0 to the number of groups - 1.
DataFrameGroupBy.nth Take the nth row from each group if n is an int, otherwise a subset of rows.
DataFrameGroupBy.nunique([dropna]) Return DataFrame with counts of unique elements in each position.
DataFrameGroupBy.ohlc() Compute open, high, low and close values of a group, excluding missing values.
DataFrameGroupBy.pct_change([periods, ...]) Calculate pct_change of each value to previous entry in group.
DataFrameGroupBy.prod([numeric_only, min_count]) Compute prod of group values.
DataFrameGroupBy.quantile([q, ...]) Return group values at the given quantile, a la numpy.percentile.
DataFrameGroupBy.rank([method, ascending, ...]) Provide the rank of values within each group.
DataFrameGroupBy.resample(rule, *args[, ...]) Provide resampling when using a TimeGrouper.
DataFrameGroupBy.rolling(*args, **kwargs) Return a rolling grouper, providing rolling functionality per group.
DataFrameGroupBy.sample([n, frac, replace, ...]) Return a random sample of items from each group.
DataFrameGroupBy.sem([ddof, numeric_only]) Compute standard error of the mean of groups, excluding missing values.
DataFrameGroupBy.shift([periods, freq, ...]) Shift each group by periods observations.
DataFrameGroupBy.size() Compute group sizes.
DataFrameGroupBy.skew([axis, skipna, ...]) Return unbiased skew within groups.
DataFrameGroupBy.std([ddof, engine, ...]) Compute standard deviation of groups, excluding missing values.
DataFrameGroupBy.sum([numeric_only, ...]) Compute sum of group values.
DataFrameGroupBy.var([ddof, engine, ...]) Compute variance of groups, excluding missing values.
DataFrameGroupBy.tail([n]) Return last n rows of each group.
DataFrameGroupBy.take(indices[, axis]) Return the elements in the given positional indices in each group.
DataFrameGroupBy.value_counts([subset, ...]) Return a Series or DataFrame containing counts of unique rows.
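
A few of the methods listed above, shown on made-up data (column names "g" and "v" are assumptions). The sketch separates reductions, which return one row per group, from cumulative and row-selecting methods, which keep the original row index.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"g": ["a", "a", "b", "b", "b"], "v": [1, 2, 3, 4, 5]})
gb = df.groupby("g")

sums = gb.sum()          # reduction: one row per group
sizes = gb.size()        # Series of group sizes
running = gb.cumsum()    # cumulative sum restarted within each group
order = gb.cumcount()    # 0-based position of each row within its group
first_two = gb.head(2)   # first two rows of each group, original index kept
```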

SeriesGroupBy computations / descriptive stats

SeriesGroupBy.all([skipna]) Return True if all values in the group are truthful, else False.
SeriesGroupBy.any([skipna]) Return True if any value in the group is truthful, else False.
SeriesGroupBy.bfill([limit]) Backward fill the values.
SeriesGroupBy.corr(other[, method, min_periods]) Compute correlation with other Series, excluding missing values.
SeriesGroupBy.count() Compute count of group, excluding missing values.
SeriesGroupBy.cov(other[, min_periods, ddof]) Compute covariance with Series, excluding missing values.
SeriesGroupBy.cumcount([ascending]) Number each item in each group from 0 to the length of that group - 1.
SeriesGroupBy.cummax([axis, numeric_only]) Cumulative max for each group.
SeriesGroupBy.cummin([axis, numeric_only]) Cumulative min for each group.
SeriesGroupBy.cumprod([axis]) Cumulative product for each group.
SeriesGroupBy.cumsum([axis]) Cumulative sum for each group.
SeriesGroupBy.describe([percentiles, ...]) Generate descriptive statistics.
SeriesGroupBy.diff([periods, axis]) First discrete difference of element.
SeriesGroupBy.ffill([limit]) Forward fill the values.
SeriesGroupBy.fillna([value, method, axis, ...]) (DEPRECATED) Fill NA/NaN values using the specified method within groups.
SeriesGroupBy.first([numeric_only, ...]) Compute the first entry of each column within each group.
SeriesGroupBy.head([n]) Return first n rows of each group.
SeriesGroupBy.last([numeric_only, ...]) Compute the last entry of each column within each group.
SeriesGroupBy.idxmax([axis, skipna]) Return the row label of the maximum value.
SeriesGroupBy.idxmin([axis, skipna]) Return the row label of the minimum value.
SeriesGroupBy.is_monotonic_increasing Return whether each group's values are monotonically increasing.
SeriesGroupBy.is_monotonic_decreasing Return whether each group's values are monotonically decreasing.
SeriesGroupBy.max([numeric_only, min_count, ...]) Compute max of group values.
SeriesGroupBy.mean([numeric_only, engine, ...]) Compute mean of groups, excluding missing values.
SeriesGroupBy.median([numeric_only]) Compute median of groups, excluding missing values.
SeriesGroupBy.min([numeric_only, min_count, ...]) Compute min of group values.
SeriesGroupBy.ngroup([ascending]) Number each group from 0 to the number of groups - 1.
SeriesGroupBy.nlargest([n, keep]) Return the largest n elements.
SeriesGroupBy.nsmallest([n, keep]) Return the smallest n elements.
SeriesGroupBy.nth Take the nth row from each group if n is an int, otherwise a subset of rows.
SeriesGroupBy.nunique([dropna]) Return number of unique elements in the group.
SeriesGroupBy.unique() Return unique values for each group.
SeriesGroupBy.ohlc() Compute open, high, low and close values of a group, excluding missing values.
SeriesGroupBy.pct_change([periods, ...]) Calculate pct_change of each value to previous entry in group.
SeriesGroupBy.prod([numeric_only, min_count]) Compute prod of group values.
SeriesGroupBy.quantile([q, interpolation, ...]) Return group values at the given quantile, a la numpy.percentile.
SeriesGroupBy.rank([method, ascending, ...]) Provide the rank of values within each group.
SeriesGroupBy.resample(rule, *args[, ...]) Provide resampling when using a TimeGrouper.
SeriesGroupBy.rolling(*args, **kwargs) Return a rolling grouper, providing rolling functionality per group.
SeriesGroupBy.sample([n, frac, replace, ...]) Return a random sample of items from each group.
SeriesGroupBy.sem([ddof, numeric_only]) Compute standard error of the mean of groups, excluding missing values.
SeriesGroupBy.shift([periods, freq, axis, ...]) Shift each group by periods observations.
SeriesGroupBy.size() Compute group sizes.
SeriesGroupBy.skew([axis, skipna, numeric_only]) Return unbiased skew within groups.
SeriesGroupBy.std([ddof, engine, ...]) Compute standard deviation of groups, excluding missing values.
SeriesGroupBy.sum([numeric_only, min_count, ...]) Compute sum of group values.
SeriesGroupBy.var([ddof, engine, ...]) Compute variance of groups, excluding missing values.
SeriesGroupBy.tail([n]) Return last n rows of each group.
SeriesGroupBy.take(indices[, axis]) Return the elements in the given positional indices in each group.
SeriesGroupBy.value_counts([normalize, ...]) Return a Series containing counts of unique values within each group.
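
Several entries in this table exist only on SeriesGroupBy (for example nlargest, unique, and the is_monotonic_* properties). A brief sketch on made-up data, with assumed names "v" for the values and "g" for the grouping keys:

```python
import pandas as pd

# Hypothetical data, for illustration only.
s = pd.Series([1, 3, 3, 5, 4], name="v")
keys = pd.Series(["a", "a", "a", "b", "b"], name="g")
gb = s.groupby(keys)

distinct = gb.nunique()                  # number of unique values per group
increasing = gb.is_monotonic_increasing  # one boolean per group
largest = gb.nlargest(1)                 # largest value per group, with its original label
counts = gb.value_counts()               # counts indexed by (group, value)
```

The results of nlargest and value_counts carry a two-level index: the group key first, then either the original row label or the counted value.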

Plotting and visualization