contingency_matrix
sklearn.metrics.cluster.contingency_matrix(labels_true, labels_pred, *, eps=None, sparse=False, dtype=<class 'numpy.int64'>)
Build a contingency matrix describing the relationship between labels.
Read more in the User Guide.
Parameters:
labels_true : array-like of shape (n_samples,)
Ground truth class labels to be used as a reference.
labels_pred : array-like of shape (n_samples,)
Cluster labels to evaluate.
eps : float, default=None
If a float, that value is added to all values in the contingency matrix. This helps to stop NaN propagation. If None, nothing is adjusted.
sparse : bool, default=False
If True, return a sparse CSR contingency matrix. If eps is not None and sparse is True, a ValueError is raised.
Added in version 0.18.
dtype : numeric type, default=np.int64
Output dtype. Ignored if eps is not None.
Added in version 0.24.
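The parameters above interact in a few ways that are easier to see side by side. The following is a minimal sketch, assuming scikit-learn 0.24 or later (for dtype) with NumPy and SciPy installed; the variable names are illustrative only.

```python
import numpy as np
from scipy import sparse as sp
from sklearn.metrics.cluster import contingency_matrix

labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [1, 0, 2, 1, 0, 2]

# Default call: dense array of integer counts.
dense = contingency_matrix(labels_true, labels_pred)

# eps is added to every cell, so the result becomes floating point.
smoothed = contingency_matrix(labels_true, labels_pred, eps=0.5)

# sparse=True returns a SciPy sparse matrix holding the same counts.
sparse_counts = contingency_matrix(labels_true, labels_pred, sparse=True)

# dtype sets the output type when eps is None.
as_float32 = contingency_matrix(labels_true, labels_pred, dtype=np.float32)

# Combining eps with sparse=True is rejected with a ValueError.
try:
    contingency_matrix(labels_true, labels_pred, eps=0.5, sparse=True)
    raised = False
except ValueError:
    raised = True
```

Adding eps to a sparse matrix would turn every zero cell into a stored value, which is a plausible reason the combination is disallowed.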
Returns:
contingency : {array-like, sparse}, shape=[n_classes_true, n_classes_pred]
Matrix \(C\) such that \(C_{i, j}\) is the number of samples in true class \(i\) and in predicted class \(j\). If eps is None, the dtype of this array will be integer unless set otherwise with the dtype argument. If eps is given, the dtype will be float. Will be a scipy.sparse.csr_matrix if sparse=True.
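The definition of \(C_{i, j}\) can be reproduced by hand with NumPy. The sketch below is not the library's implementation; contingency_sketch is a hypothetical helper that only illustrates the counting rule, assuming one-dimensional label sequences.

```python
import numpy as np


def contingency_sketch(labels_true, labels_pred):
    """Count samples per (true class, predicted cluster) pair.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    # Map each label to the index of its value among the sorted unique labels.
    classes, class_idx = np.unique(labels_true, return_inverse=True)
    clusters, cluster_idx = np.unique(labels_pred, return_inverse=True)

    counts = np.zeros((classes.size, clusters.size), dtype=np.int64)
    # np.add.at accumulates correctly when the same (i, j) pair repeats.
    np.add.at(counts, (class_idx, cluster_idx), 1)
    return counts


result = contingency_sketch([0, 0, 1, 1, 2, 2], [1, 0, 2, 1, 0, 2])
```

Here row i corresponds to the i-th sorted unique value of labels_true and column j to the j-th sorted unique value of labels_pred.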
Examples
>>> from sklearn.metrics.cluster import contingency_matrix
>>> labels_true = [0, 0, 1, 1, 2, 2]
>>> labels_pred = [1, 0, 2, 1, 0, 2]
>>> contingency_matrix(labels_true, labels_pred)
array([[1, 1, 0],
       [0, 1, 1],
       [1, 0, 1]])
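As a further illustration, the row and column sums of the matrix give the size of each true class and each predicted cluster. This is a minimal sketch with made-up string labels, assuming that rows and columns follow the sorted order of the unique label values.

```python
from sklearn.metrics.cluster import contingency_matrix

# Illustrative data: string class labels and integer cluster ids.
labels_true = ["cat", "dog", "cat", "dog", "cat"]
labels_pred = [1, 1, 0, 1, 0]

table = contingency_matrix(labels_true, labels_pred)

# Row sums: number of samples in each true class ("cat", then "dog").
class_sizes = table.sum(axis=1)

# Column sums: number of samples in each predicted cluster (0, then 1).
cluster_sizes = table.sum(axis=0)
```

Under that ordering assumption, table has one row per true class and one column per cluster, and both sets of sums add up to the number of samples.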