GroupBy — pandas 3.0.0.dev0+2097.gcdc5b7418e documentation

pandas.api.typing.DataFrameGroupBy and pandas.api.typing.SeriesGroupBy instances are returned by the groupby calls pandas.DataFrame.groupby() and pandas.Series.groupby(), respectively.
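As a small illustration of the sentence above, the sketch below checks which class each groupby call returns. The column names and data are made up for the example, and the fallback import is an assumption about older pandas releases that do not ship pandas.api.typing.

```python
import pandas as pd

try:
    # Public location of the classes, intended for type checks.
    from pandas.api.typing import DataFrameGroupBy, SeriesGroupBy
except ImportError:
    # Assumed fallback for older releases without pandas.api.typing.
    from pandas.core.groupby import DataFrameGroupBy, SeriesGroupBy

# Illustrative data; "team" and "score" are hypothetical column names.
df = pd.DataFrame({"team": ["a", "a", "b"], "score": [1, 2, 3]})

frame_groups = df.groupby("team")                # DataFrame.groupby()
series_groups = df["score"].groupby(df["team"])  # Series.groupby()
column_groups = frame_groups["score"]            # selecting one column

is_frame_groupby = isinstance(frame_groups, DataFrameGroupBy)
is_series_groupby = isinstance(series_groups, SeriesGroupBy)
is_column_series_groupby = isinstance(column_groups, SeriesGroupBy)
```

Selecting a single column from a DataFrameGroupBy also yields a SeriesGroupBy, which is why many methods below are listed for both classes.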

Indexing, iteration

Grouper(*args, **kwargs) A Grouper allows the user to specify a groupby instruction for an object.
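A minimal sketch of how a Grouper might be passed to groupby, assuming a small made-up sales table. The key and freq arguments are real Grouper parameters; the column names and values are hypothetical.

```python
import pandas as pd

# Illustrative data; column names are hypothetical.
df = pd.DataFrame(
    {
        "when": pd.to_datetime(
            ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"]
        ),
        "store": ["x", "y", "x", "x"],
        "sales": [10, 20, 30, 40],
    }
)

# Group by the values of the "store" column.
by_store = df.groupby(pd.Grouper(key="store"))["sales"].sum()

# Group by calendar day of the datetime column "when".
by_day = df.groupby(pd.Grouper(key="when", freq="D"))["sales"].sum()
```

With freq set, the Grouper bins a datetime column into fixed periods, which a plain column name passed to groupby would not do.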

Function application helper

NamedAgg(column, aggfunc) Helper for column specific aggregation with control over output column names.
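The following sketch shows one way NamedAgg can be used with agg to choose both the source column and the output column name. The data and the output names min_height and max_weight are invented for illustration.

```python
import pandas as pd

# Illustrative data; column names are hypothetical.
df = pd.DataFrame(
    {
        "kind": ["cat", "dog", "cat", "dog"],
        "height": [9.1, 6.0, 9.5, 34.0],
        "weight": [7.9, 7.5, 9.9, 198.0],
    }
)

# Each keyword becomes an output column; NamedAgg names the input
# column and the aggregation applied to it.
summary = df.groupby("kind").agg(
    min_height=pd.NamedAgg(column="height", aggfunc="min"),
    max_weight=pd.NamedAgg(column="weight", aggfunc="max"),
)
```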

Function application

SeriesGroupBy.apply(func, *args, **kwargs) Apply function func group-wise and combine the results together.
DataFrameGroupBy.apply(func, *args[, ...]) Apply function func group-wise and combine the results together.
SeriesGroupBy.agg([func, engine, engine_kwargs]) Aggregate using one or more operations.
DataFrameGroupBy.agg([func, engine, ...]) Aggregate using one or more operations.
SeriesGroupBy.aggregate([func, engine, ...]) Aggregate using one or more operations.
DataFrameGroupBy.aggregate([func, engine, ...]) Aggregate using one or more operations.
SeriesGroupBy.transform(func, *args[, ...]) Call function producing a same-indexed Series on each group.
DataFrameGroupBy.transform(func, *args[, ...]) Call function producing a same-indexed DataFrame on each group.
SeriesGroupBy.pipe(func, *args, **kwargs) Apply a func with arguments to this GroupBy object and return its result.
DataFrameGroupBy.pipe(func, *args, **kwargs) Apply a func with arguments to this GroupBy object and return its result.
DataFrameGroupBy.filter(func[, dropna]) Filter elements from groups that don't satisfy a criterion.
SeriesGroupBy.filter(func[, dropna]) Filter elements from groups that don't satisfy a criterion.
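To make the differences between the methods listed above more concrete, here is a hedged sketch on a SeriesGroupBy. The data and the lambdas are hypothetical; the point is only the shape of each result.

```python
import pandas as pd

# Illustrative data; "key" and "val" are hypothetical column names.
df = pd.DataFrame({"key": ["a", "a", "b", "b", "b"], "val": [1, 2, 3, 4, 5]})
grouped = df.groupby("key")["val"]  # a SeriesGroupBy

# agg: reduces each group to one value, indexed by group key.
totals = grouped.agg("sum")

# transform: returns a result with the same index as the input.
centered = grouped.transform(lambda s: s - s.mean())

# filter: keeps the rows of groups for which the function returns True.
big = grouped.filter(lambda s: s.sum() > 5)

# pipe: passes the GroupBy object itself to the function.
spread = grouped.pipe(lambda g: g.max() - g.min())

# apply: calls the function on each group and combines the results.
ranges = grouped.apply(lambda s: s.max() - s.min())
```

Here agg and apply give one value per group, transform gives one value per input row, and filter returns a subset of the original rows.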

DataFrameGroupBy computations / descriptive stats

DataFrameGroupBy.all([skipna]) Return True if all values in the group are truthful, else False.
DataFrameGroupBy.any([skipna]) Return True if any value in the group is truthful, else False.
DataFrameGroupBy.bfill([limit]) Backward fill the values.
DataFrameGroupBy.corr([method, min_periods, ...]) Compute pairwise correlation of columns, excluding NA/null values.
DataFrameGroupBy.corrwith(other[, drop, ...]) (DEPRECATED) Compute pairwise correlation.
DataFrameGroupBy.count() Compute count of group, excluding missing values.
DataFrameGroupBy.cov([min_periods, ddof, ...]) Compute pairwise covariance of columns, excluding NA/null values.
DataFrameGroupBy.cumcount([ascending]) Number each item in each group from 0 to the length of that group - 1.
DataFrameGroupBy.cummax([numeric_only]) Cumulative max for each group.
DataFrameGroupBy.cummin([numeric_only]) Cumulative min for each group.
DataFrameGroupBy.cumprod([numeric_only]) Cumulative product for each group.
DataFrameGroupBy.cumsum([numeric_only]) Cumulative sum for each group.
DataFrameGroupBy.describe([percentiles, ...]) Generate descriptive statistics.
DataFrameGroupBy.diff([periods]) First discrete difference of element.
DataFrameGroupBy.ewm([com, span, halflife, ...]) Return an ewm grouper, providing ewm functionality per group.
DataFrameGroupBy.expanding([min_periods, method]) Return an expanding grouper, providing expanding functionality per group.
DataFrameGroupBy.ffill([limit]) Forward fill the values.
DataFrameGroupBy.first([numeric_only, ...]) Compute the first entry of each column within each group.
DataFrameGroupBy.head([n]) Return first n rows of each group.
DataFrameGroupBy.idxmax([skipna, numeric_only]) Return index of first occurrence of maximum in each group.
DataFrameGroupBy.idxmin([skipna, numeric_only]) Return index of first occurrence of minimum in each group.
DataFrameGroupBy.last([numeric_only, ...]) Compute the last entry of each column within each group.
DataFrameGroupBy.max([numeric_only, ...]) Compute max of group values.
DataFrameGroupBy.mean([numeric_only, ...]) Compute mean of groups, excluding missing values.
DataFrameGroupBy.median([numeric_only, skipna]) Compute median of groups, excluding missing values.
DataFrameGroupBy.min([numeric_only, ...]) Compute min of group values.
DataFrameGroupBy.ngroup([ascending]) Number each group from 0 to the number of groups - 1.
DataFrameGroupBy.nth Take the nth row from each group if n is an int, otherwise a subset of rows.
DataFrameGroupBy.nunique([dropna]) Return DataFrame with counts of unique elements in each position.
DataFrameGroupBy.ohlc() Compute open, high, low and close values of a group, excluding missing values.
DataFrameGroupBy.pct_change([periods, ...]) Calculate pct_change of each value to previous entry in group.
DataFrameGroupBy.prod([numeric_only, ...]) Compute prod of group values.
DataFrameGroupBy.quantile([q, ...]) Return group values at the given quantile, a la numpy.percentile.
DataFrameGroupBy.rank([method, ascending, ...]) Provide the rank of values within each group.
DataFrameGroupBy.resample(rule, *args[, ...]) Provide resampling when using a TimeGrouper.
DataFrameGroupBy.rolling(window[, ...]) Return a rolling grouper, providing rolling functionality per group.
DataFrameGroupBy.sample([n, frac, replace, ...]) Return a random sample of items from each group.
DataFrameGroupBy.sem([ddof, numeric_only, ...]) Compute standard error of the mean of groups, excluding missing values.
DataFrameGroupBy.shift([periods, freq, ...]) Shift each group by periods observations.
DataFrameGroupBy.size() Compute group sizes.
DataFrameGroupBy.skew([skipna, numeric_only]) Return unbiased skew within groups.
DataFrameGroupBy.kurt([skipna, numeric_only]) Return unbiased kurtosis within groups.
DataFrameGroupBy.std([ddof, engine, ...]) Compute standard deviation of groups, excluding missing values.
DataFrameGroupBy.sum([numeric_only, ...]) Compute sum of group values.
DataFrameGroupBy.var([ddof, engine, ...]) Compute variance of groups, excluding missing values.
DataFrameGroupBy.tail([n]) Return last n rows of each group.
DataFrameGroupBy.take(indices, **kwargs) Return the elements in the given positional indices in each group.
DataFrameGroupBy.value_counts([subset, ...]) Return a Series or DataFrame containing counts of unique rows.
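The methods above fall roughly into reductions (one row per group), cumulative or numbering operations (one value per input row), and row selectors. The sketch below, on made-up data, shows a few from each family; the grouping into families is an informal reading of the list, not terminology from this page.

```python
import pandas as pd

# Illustrative data; column names are hypothetical.
df = pd.DataFrame(
    {"key": ["a", "a", "b", "b"], "x": [1, 3, 5, 7], "y": [2.0, 4.0, 6.0, 8.0]}
)
g = df.groupby("key")  # a DataFrameGroupBy

# Reductions: one row (or value) per group.
means = g.mean()
sizes = g.size()

# Cumulative and numbering operations: aligned with the original rows.
running = g.cumsum()     # cumulative sum of "x" and "y" within each group
position = g.cumcount()  # 0-based position of each row inside its group
group_id = g.ngroup()    # 0-based number of the group each row belongs to

# Row selection: returns rows of the original frame.
firsts = g.head(1)
```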

SeriesGroupBy computations / descriptive stats

SeriesGroupBy.all([skipna]) Return True if all values in the group are truthful, else False.
SeriesGroupBy.any([skipna]) Return True if any value in the group is truthful, else False.
SeriesGroupBy.bfill([limit]) Backward fill the values.
SeriesGroupBy.corr(other[, method, min_periods]) Compute correlation with other Series, excluding missing values.
SeriesGroupBy.count() Compute count of group, excluding missing values.
SeriesGroupBy.cov(other[, min_periods, ddof]) Compute covariance with Series, excluding missing values.
SeriesGroupBy.cumcount([ascending]) Number each item in each group from 0 to the length of that group - 1.
SeriesGroupBy.cummax([numeric_only]) Cumulative max for each group.
SeriesGroupBy.cummin([numeric_only]) Cumulative min for each group.
SeriesGroupBy.cumprod([numeric_only]) Cumulative product for each group.
SeriesGroupBy.cumsum([numeric_only]) Cumulative sum for each group.
SeriesGroupBy.describe([percentiles, ...]) Generate descriptive statistics.
SeriesGroupBy.diff([periods]) First discrete difference of element.
SeriesGroupBy.ewm([com, span, halflife, ...]) Return an ewm grouper, providing ewm functionality per group.
SeriesGroupBy.expanding([min_periods, method]) Return an expanding grouper, providing expanding functionality per group.
SeriesGroupBy.ffill([limit]) Forward fill the values.
SeriesGroupBy.first([numeric_only, ...]) Compute the first entry of each column within each group.
SeriesGroupBy.head([n]) Return first n rows of each group.
SeriesGroupBy.last([numeric_only, ...]) Compute the last entry of each column within each group.
SeriesGroupBy.idxmax([skipna]) Return the row label of the maximum value.
SeriesGroupBy.idxmin([skipna]) Return the row label of the minimum value.
SeriesGroupBy.is_monotonic_increasing Return whether each group's values are monotonically increasing.
SeriesGroupBy.is_monotonic_decreasing Return whether each group's values are monotonically decreasing.
SeriesGroupBy.max([numeric_only, min_count, ...]) Compute max of group values.
SeriesGroupBy.mean([numeric_only, skipna, ...]) Compute mean of groups, excluding missing values.
SeriesGroupBy.median([numeric_only, skipna]) Compute median of groups, excluding missing values.
SeriesGroupBy.min([numeric_only, min_count, ...]) Compute min of group values.
SeriesGroupBy.ngroup([ascending]) Number each group from 0 to the number of groups - 1.
SeriesGroupBy.nlargest([n, keep]) Return the largest n elements.
SeriesGroupBy.nsmallest([n, keep]) Return the smallest n elements.
SeriesGroupBy.nth Take the nth row from each group if n is an int, otherwise a subset of rows.
SeriesGroupBy.nunique([dropna]) Return number of unique elements in the group.
SeriesGroupBy.unique() Return unique values for each group.
SeriesGroupBy.ohlc() Compute open, high, low and close values of a group, excluding missing values.
SeriesGroupBy.pct_change([periods, ...]) Calculate pct_change of each value to previous entry in group.
SeriesGroupBy.prod([numeric_only, ...]) Compute prod of group values.
SeriesGroupBy.quantile([q, interpolation, ...]) Return group values at the given quantile, a la numpy.percentile.
SeriesGroupBy.rank([method, ascending, ...]) Provide the rank of values within each group.
SeriesGroupBy.resample(rule, *args[, ...]) Provide resampling when using a TimeGrouper.
SeriesGroupBy.rolling(window[, min_periods, ...]) Return a rolling grouper, providing rolling functionality per group.
SeriesGroupBy.sample([n, frac, replace, ...]) Return a random sample of items from each group.
SeriesGroupBy.sem([ddof, numeric_only, skipna]) Compute standard error of the mean of groups, excluding missing values.
SeriesGroupBy.shift([periods, freq, ...]) Shift each group by periods observations.
SeriesGroupBy.size() Compute group sizes.
SeriesGroupBy.skew([skipna, numeric_only]) Return unbiased skew within groups.
SeriesGroupBy.kurt([skipna, numeric_only]) Return unbiased kurtosis within groups.
SeriesGroupBy.std([ddof, engine, ...]) Compute standard deviation of groups, excluding missing values.
SeriesGroupBy.sum([numeric_only, min_count, ...]) Compute sum of group values.
SeriesGroupBy.var([ddof, engine, ...]) Compute variance of groups, excluding missing values.
SeriesGroupBy.tail([n]) Return last n rows of each group.
SeriesGroupBy.take(indices, **kwargs) Return the elements in the given positional indices in each group.
SeriesGroupBy.value_counts([normalize, ...]) Return a Series or DataFrame containing counts of unique rows.
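Several entries above (nlargest, unique, is_monotonic_increasing) appear only in the SeriesGroupBy list. A brief sketch of how they might be used, with hypothetical data:

```python
import pandas as pd

# Illustrative data; "key" and "val" are hypothetical column names.
df = pd.DataFrame(
    {"key": ["g1", "g1", "g1", "g2", "g2", "g2"], "val": [1, 2, 2, 5, 4, 9]}
)
g = df.groupby("key")["val"]  # a SeriesGroupBy

distinct_counts = g.nunique()          # number of distinct values per group
distinct_values = g.unique()           # array of distinct values per group
monotonic = g.is_monotonic_increasing  # property: one boolean per group
top = g.nlargest(1)                    # largest value within each group
```

Note that is_monotonic_increasing is accessed as a property, without parentheses, while the others are method calls.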

Plotting and visualization