pandas.core.groupby.DataFrameGroupBy.corr — pandas 2.2.3 documentation
DataFrameGroupBy.corr(method='pearson', min_periods=1, numeric_only=False)
Compute pairwise correlation of columns, excluding NA/null values.
Parameters:
method : {‘pearson’, ‘kendall’, ‘spearman’} or callable
Method of correlation:
- pearson : standard correlation coefficient
- kendall : Kendall Tau correlation coefficient
- spearman : Spearman rank correlation
- callable : a callable that takes two 1d ndarrays and returns a float. Note that the matrix returned by corr will have 1 along the diagonal and will be symmetric regardless of the callable’s behavior.
min_periods : int, optional
Minimum number of observations required per pair of columns to have a valid result. Currently only available for Pearson and Spearman correlation.
numeric_only : bool, default False
Include only float, int or boolean data.
Added in version 1.5.0.
Changed in version 2.0.0: The default value of numeric_only is now False.
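The effect of numeric_only can be illustrated with a small sketch. The frame and its column names (group, label, x, y) are hypothetical and not part of the pandas documentation. With the default numeric_only=False, a non-numeric column such as label is expected to make the call fail rather than be silently dropped, so the sketch passes numeric_only=True explicitly.

```python
import pandas as pd

# Hypothetical data: "label" is a non-numeric column that is not a group key.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "label": ["p", "q", "r", "s", "t", "u"],
    "x": [1.0, 2.0, 4.0, 1.0, 3.0, 2.0],
    "y": [2.0, 3.0, 9.0, 5.0, 1.0, 2.0],
})

# numeric_only=True restricts each group's correlation matrix to x and y.
result = df.groupby("group").corr(numeric_only=True)
print(result)
```

Selecting the numeric columns up front, for example df.groupby("group")[["x", "y"]].corr(), is an alternative that makes the intended columns explicit.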
Returns:
DataFrame
Correlation matrix.
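For a grouped frame, the returned DataFrame holds one correlation matrix per group. A minimal sketch of the resulting layout, using hypothetical data and column names (group, x, y):

```python
import pandas as pd

# Hypothetical data: two groups, two numeric columns.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "y": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
})

result = df.groupby("group").corr()

# The rows carry a two-level index: the outer level is the group key,
# the inner level is the column name. The columns are x and y.
print(result)

# Select the correlation matrix of a single group by its key.
matrix_a = result.loc["a"]
print(matrix_a)
```

In this data x and y are perfectly positively correlated in group a and perfectly negatively correlated in group b, so the off-diagonal entries differ in sign between the two blocks.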
See also
DataFrame.corrwith
Compute pairwise correlation with another DataFrame or Series.
Series.corr
Compute the correlation between two Series.
Notes
Pearson, Kendall and Spearman correlation are currently computed using pairwise complete observations.
- Pearson correlation coefficient
- Kendall rank correlation coefficient
- Spearman’s rank correlation coefficient
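"Pairwise complete observations" means that, for each pair of columns, only the rows where both values are present contribute to the coefficient. The sketch below checks this for Pearson correlation against a manual computation. The data and the names group, x and y are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in different rows of x and y.
df = pd.DataFrame({
    "group": ["a"] * 5,
    "x": [1.0, 2.0, 3.0, 4.0, np.nan],
    "y": [2.0, np.nan, 5.0, 9.0, 1.0],
})

grouped = df.groupby("group").corr()
from_groupby = grouped.loc[("a", "x"), "y"]

# Manual equivalent: keep only rows where both x and y are present,
# then compute the Pearson correlation on the remaining rows.
complete = df.dropna(subset=["x", "y"])
manual = complete["x"].corr(complete["y"])

print(from_groupby, manual)
```

Because rows are dropped per pair rather than for the whole frame, different cells of the same matrix can be based on different subsets of rows.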
Examples
>>> def histogram_intersection(a, b):
...     v = np.minimum(a, b).sum().round(decimals=1)
...     return v
>>> df = pd.DataFrame([(.2, .3), (.0, .6), (.6, .0), (.2, .1)],
...                   columns=['dogs', 'cats'])
>>> df.corr(method=histogram_intersection)
      dogs  cats
dogs   1.0   0.3
cats   0.3   1.0
>>> df = pd.DataFrame([(1, 1), (2, np.nan), (np.nan, 3), (4, 4)],
...                   columns=['dogs', 'cats'])
>>> df.corr(min_periods=3)
      dogs  cats
dogs   1.0   NaN
cats   NaN   1.0
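The examples above call DataFrame.corr directly. A sketch of the same two options applied per group follows. The grouping column owner and its values are hypothetical, and the dogs and cats values reuse the data from the examples above, one data set per group.

```python
import numpy as np
import pandas as pd

def histogram_intersection(a, b):
    # Takes two 1d arrays and returns a float, as the method parameter requires.
    return np.minimum(a, b).sum().round(decimals=1)

df = pd.DataFrame({
    "owner": ["ann"] * 4 + ["bob"] * 4,  # hypothetical group key
    "dogs": [0.2, 0.0, 0.6, 0.2, 1.0, 2.0, np.nan, 4.0],
    "cats": [0.3, 0.6, 0.0, 0.1, 1.0, np.nan, 3.0, 4.0],
})
grouped = df.groupby("owner")[["dogs", "cats"]]

# Custom callable, applied separately within each group.
by_callable = grouped.corr(method=histogram_intersection)
print(by_callable)

# Pearson correlation that requires at least 3 complete pairs per group.
# "bob" has only 2 rows where both columns are present, so its
# off-diagonal entries are NaN, while "ann" has 4 and gets a value.
with_min = grouped.corr(min_periods=3)
print(with_min)
```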