pandas.core.groupby.DataFrameGroupBy.corr (pandas 2.2.3 documentation)

DataFrameGroupBy.corr(method='pearson', min_periods=1, numeric_only=False)

Compute pairwise correlation of columns, excluding NA/null values.

Parameters:

method : {‘pearson’, ‘kendall’, ‘spearman’} or callable

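Method of correlation:

pearson : standard correlation coefficient

kendall : Kendall Tau correlation coefficient

spearman : Spearman rank correlation

callable : callable with input two 1d ndarrays and returning a float. Note that the returned matrix from corr will have 1 along the diagonals and will be symmetric regardless of the callable's behavior.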

min_periods : int, optional

Minimum number of observations required per pair of columns to have a valid result. Currently only available for Pearson and Spearman correlation.

numeric_only : bool, default False

Include only float, int or boolean data.

Added in version 1.5.0.

Changed in version 2.0.0: The default value of numeric_only is now False.
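As a rough illustration of numeric_only, the sketch below groups a small made-up frame that contains one string column. The column names and data are assumptions chosen for this example. With numeric_only=True the string column is left out of each group's correlation matrix; with the default of False, pandas 2.x is expected to raise an error instead, because the strings cannot be converted to floats.

```python
import pandas as pd

# Hypothetical data: "site" is the grouping key, "label" is non-numeric.
df = pd.DataFrame(
    {
        "site": ["a", "a", "a", "b", "b", "b"],
        "x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
        "y": [1.0, 3.0, 2.0, 3.0, 1.0, 2.0],
        "label": ["p", "q", "r", "p", "q", "r"],
    }
)

# Only numeric columns take part; "label" does not appear in the result.
numeric_corr = df.groupby("site").corr(numeric_only=True)
print(numeric_corr.columns.tolist())  # ['x', 'y']
```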

Returns:

DataFrame

Correlation matrix.
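For a grouped call, the returned DataFrame is expected to hold one correlation matrix per group, stacked under a two-level row index (group key first, then column name). The following sketch uses made-up data and column names to show that layout and how a single group's matrix or a single coefficient might be selected.

```python
import pandas as pd

# Hypothetical data: y rises with x in group "a" and falls with x in group "b".
df = pd.DataFrame(
    {
        "species": ["a", "a", "a", "b", "b", "b"],
        "x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
        "y": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
    }
)

result = df.groupby("species").corr()

# Two groups times two columns gives four rows; the columns are x and y.
print(result.shape)

# Select one group's matrix, or a single coefficient, through the MultiIndex.
matrix_a = result.loc["a"]
r_b = result.loc[("b", "x"), "y"]
```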

See also

DataFrame.corrwith

Compute pairwise correlation with another DataFrame or Series.

Series.corr

Compute the correlation between two Series.

Notes

Pearson, Kendall and Spearman correlation are currently computed using pairwise complete observations.
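To make "pairwise complete observations" concrete, the sketch below compares the default behaviour with dropping incomplete rows first. The frame, its single group "g1", and the column names are assumptions for illustration. The coefficient for columns a and b should use every row where both a and b are present, regardless of gaps in c.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column c has gaps, columns a and b are complete.
df = pd.DataFrame(
    {
        "g": ["g1"] * 5,
        "a": [1.0, 2.0, 3.0, 4.0, 5.0],
        "b": [2.0, 1.0, 4.0, 3.0, 6.0],
        "c": [1.0, np.nan, 3.0, np.nan, 5.0],
    }
)

# Pairwise: the (a, b) coefficient uses all five rows, even though c has gaps.
pairwise = df.groupby("g").corr().loc[("g1", "a"), "b"]

# Listwise: dropping every row with any missing value leaves only three rows.
listwise = df.dropna().groupby("g").corr().loc[("g1", "a"), "b"]

print(round(pairwise, 3), round(listwise, 3))
```

The two numbers differ because the listwise version discards rows that are perfectly usable for the (a, b) pair.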

Examples

>>> def histogram_intersection(a, b):
...     v = np.minimum(a, b).sum().round(decimals=1)
...     return v
>>> df = pd.DataFrame([(.2, .3), (.0, .6), (.6, .0), (.2, .1)],
...                   columns=['dogs', 'cats'])
>>> df.corr(method=histogram_intersection)
      dogs  cats
dogs   1.0   0.3
cats   0.3   1.0
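Besides a callable, method also accepts the built-in names. A minimal sketch, using made-up groups "g1" and "g2", contrasts 'pearson' and 'spearman' in a grouped call: for a monotonic but nonlinear relationship the rank-based Spearman coefficient should reach 1 while the Pearson coefficient stays below it.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "g": ["g1"] * 5 + ["g2"] * 5,
        "x": [1.0, 2.0, 3.0, 4.0, 5.0] * 2,
        # g1: y = x**3 (monotonic, nonlinear); g2: y decreases linearly in x.
        "y": [1.0, 8.0, 27.0, 64.0, 125.0, 10.0, 8.0, 6.0, 4.0, 2.0],
    }
)

grouped = df.groupby("g")
pearson = grouped.corr(method="pearson")
spearman = grouped.corr(method="spearman")

# Compare the x-y coefficient of group g1 under both methods.
print(pearson.loc[("g1", "x"), "y"])
print(spearman.loc[("g1", "x"), "y"])
```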

>>> df = pd.DataFrame([(1, 1), (2, np.nan), (np.nan, 3), (4, 4)],
...                   columns=['dogs', 'cats'])
>>> df.corr(min_periods=3)
      dogs  cats
dogs   1.0   NaN
cats   NaN   1.0
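The examples above call DataFrame.corr directly. In a grouped call, min_periods would be checked within each group separately, so a small group can yield NaN while a larger one yields a number. The sketch below assumes a hypothetical frame with a four-row group "a" and a two-row group "b".

```python
import pandas as pd

df = pd.DataFrame(
    {
        "g": ["a"] * 4 + ["b"] * 2,
        "x": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0],
        "y": [2.0, 1.0, 4.0, 3.0, 5.0, 7.0],
    }
)

result = df.groupby("g").corr(min_periods=3)

r_a = result.loc[("a", "x"), "y"]  # four paired observations: a number
r_b = result.loc[("b", "x"), "y"]  # two paired observations: NaN
print(r_a, r_b)
```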