pandas.core.groupby.DataFrameGroupBy.value_counts — pandas 3.0.0.dev0+2103.g41968a550a documentation
DataFrameGroupBy.value_counts(subset=None, normalize=False, sort=True, ascending=False, dropna=True)
Return a Series or DataFrame containing counts of unique rows.
Added in version 1.4.0.
Parameters:
subset : list-like, optional
    Columns to use when counting unique combinations.
normalize : bool, default False
    Return proportions rather than frequencies.
sort : bool, default True
    Sort by frequencies when True. When False, non-grouping columns will appear in the order they occur within groups.
    Changed in version 3.0.0: In prior versions, sort=False would sort the non-grouping columns by label.
ascending : bool, default False
    Sort in ascending order.
dropna : bool, default True
    Don’t include counts of rows that contain NA values.
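As a rough sketch of how subset and dropna interact, the snippet below counts a single column per group and then repeats the count with NA rows kept. The frame is a variant of the one used in the Examples section with one education value replaced by None; that modification is an assumption made purely for illustration.

```python
import pandas as pd

# Illustrative data: same shape as the Examples frame, but with one
# missing education value (an assumption made for this sketch).
df = pd.DataFrame(
    {
        "gender": ["male", "male", "female", "male", "female", "male"],
        "education": ["low", "medium", "high", "low", None, "low"],
        "country": ["US", "FR", "US", "FR", "FR", "FR"],
    }
)

# Count only the "education" column within each gender group.
# With the default dropna=True, the row with a missing value is omitted.
by_education = df.groupby("gender").value_counts(subset=["education"])

# Keep the row with the missing value by passing dropna=False.
with_na = df.groupby("gender").value_counts(subset=["education"], dropna=False)

print(by_education)
print(with_na)
```

Note that subset names only non-grouping columns here; the grouping key ("gender") is always part of the result index.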
Returns:
Series or DataFrame
    Series if the groupby as_index is True, otherwise DataFrame.
See also
Series.value_counts
    Equivalent method on Series.
DataFrame.value_counts
    Equivalent method on DataFrame.
SeriesGroupBy.value_counts
    Equivalent method on SeriesGroupBy.
Notes
- If the groupby as_index is True, then the returned Series will have a MultiIndex with one level per input column.
- If the groupby as_index is False, then the returned DataFrame will have an additional column with the value_counts. The column is labelled 'count' or 'proportion', depending on the normalize parameter.
By default, rows that contain any NA values are omitted from the result.
By default, the result will be in descending order so that the first element of each group is the most frequently-occurring row.
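The notes above can be checked with a short sketch that inspects the shape of each result. It reuses the frame from the Examples section; the variable names are illustrative only, and the 'count' and 'proportion' labels assume a pandas version that names the result this way, as this page describes.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "gender": ["male", "male", "female", "male", "female", "male"],
        "education": ["low", "medium", "high", "low", "high", "low"],
        "country": ["US", "FR", "US", "FR", "FR", "FR"],
    }
)

# as_index=True (the groupby default): a Series with one MultiIndex
# level per input column.
counts = df.groupby("gender").value_counts()
print(type(counts).__name__, list(counts.index.names), counts.name)

# as_index=False: a DataFrame with an extra column holding the values.
table = df.groupby("gender", as_index=False).value_counts(normalize=True)
print(type(table).__name__, list(table.columns))

# With normalize=True the proportions are computed within each group,
# so they add up to 1 per group.
proportions = df.groupby("gender").value_counts(normalize=True)
per_group_total = proportions.groupby(level="gender").sum()
print(per_group_total)
```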
Examples
>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         "gender": ["male", "male", "female", "male", "female", "male"],
...         "education": ["low", "medium", "high", "low", "high", "low"],
...         "country": ["US", "FR", "US", "FR", "FR", "FR"],
...     }
... )

>>> df
   gender education country
0    male       low      US
1    male    medium      FR
2  female      high      US
3    male       low      FR
4  female      high      FR
5    male       low      FR

>>> df.groupby("gender").value_counts()
gender  education  country
female  high       US         1
                   FR         1
male    low        FR         2
                   US         1
        medium     FR         1
Name: count, dtype: int64

>>> df.groupby("gender").value_counts(ascending=True)
gender  education  country
female  high       US         1
                   FR         1
male    low        US         1
        medium     FR         1
        low        FR         2
Name: count, dtype: int64

>>> df.groupby("gender").value_counts(normalize=True)
gender  education  country
female  high       US         0.50
                   FR         0.50
male    low        FR         0.50
                   US         0.25
        medium     FR         0.25
Name: proportion, dtype: float64

>>> df.groupby("gender", as_index=False).value_counts()
   gender education country  count
0  female      high      US      1
1  female      high      FR      1
2    male       low      FR      2
3    male       low      US      1
4    male    medium      FR      1

>>> df.groupby("gender", as_index=False).value_counts(normalize=True)
   gender education country  proportion
0  female      high      US        0.50
1  female      high      FR        0.50
2    male       low      FR        0.50
3    male       low      US        0.25
4    male    medium      FR        0.25
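The examples above do not exercise sort=False. A minimal sketch follows, assuming the same frame; because the row order for sort=False changed in version 3.0.0 (see the sort parameter), no expected output is shown, and the comparison is deliberately order-independent.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "gender": ["male", "male", "female", "male", "female", "male"],
        "education": ["low", "medium", "high", "low", "high", "low"],
        "country": ["US", "FR", "US", "FR", "FR", "FR"],
    }
)

# Default: rows within each group are ordered by frequency.
by_frequency = df.groupby("gender").value_counts()

# sort=False: no frequency sort; row order depends on the pandas version.
unsorted = df.groupby("gender").value_counts(sort=False)

# Only the row order may differ; the counts themselves are identical.
same_counts = unsorted.to_dict() == by_frequency.to_dict()
print(unsorted)
print(same_counts)
```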