fowlkes_mallows_score
sklearn.metrics.fowlkes_mallows_score(labels_true, labels_pred, *, sparse='deprecated')
Measure the similarity of two clusterings of a set of points.
Added in version 0.18.
The Fowlkes-Mallows index (FMI) is defined as the geometric mean of the precision and recall:
FMI = TP / sqrt((TP + FP) * (TP + FN))
where TP is the number of true positives (i.e. the number of pairs of points that belong to the same cluster in both labels_true and labels_pred), FP is the number of false positives (i.e. the number of pairs of points that belong to the same cluster in labels_pred but not in labels_true), and FN is the number of false negatives (i.e. the number of pairs of points that belong to the same cluster in labels_true but not in labels_pred).
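The pair-based definition above can be checked by brute force. The following is a minimal sketch, not the library's implementation: the helper names pair_counts and fmi_from_pairs are hypothetical, and returning 0.0 when there are no true positives is a convention assumed here to avoid dividing by zero.

```python
from itertools import combinations
from math import sqrt


def pair_counts(labels_true, labels_pred):
    """Count TP, FP and FN over all unordered pairs of points."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # pair is together in both clusterings
        elif same_pred:
            fp += 1  # together in labels_pred only
        elif same_true:
            fn += 1  # together in labels_true only
    return tp, fp, fn


def fmi_from_pairs(labels_true, labels_pred):
    """Hypothetical helper: FMI = TP / sqrt((TP + FP) * (TP + FN))."""
    tp, fp, fn = pair_counts(labels_true, labels_pred)
    if tp == 0:
        return 0.0  # assumed convention when no pair agrees
    return tp / sqrt((tp + fp) * (tp + fn))


# Partially agreeing clusterings: one TP, two FP, one FN.
print(pair_counts([0, 0, 1, 1], [0, 0, 0, 1]))     # (1, 2, 1)
print(fmi_from_pairs([0, 0, 1, 1], [0, 0, 0, 1]))  # 1 / sqrt(6), about 0.408
```

Enumerating every pair costs O(n_samples^2) time, so this form is only practical for small inputs.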
The score ranges from 0 to 1. A high value indicates good similarity between the two clusterings.
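The range follows from the geometric-mean form. A short derivation, where P and R are shorthand symbols for pairwise precision and recall introduced here for illustration (they are not part of the API), and TP > 0 is assumed:

```latex
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
\mathrm{FMI} = \sqrt{P \cdot R} = \frac{TP}{\sqrt{(TP + FP)\,(TP + FN)}}
\]
\[
0 \le P \le 1,\quad 0 \le R \le 1
\;\Longrightarrow\; 0 \le \mathrm{FMI} \le 1,
\qquad
\mathrm{FMI} = 1 \iff FP = FN = 0 .
\]
```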
Read more in the User Guide.
Parameters:
labels_true : array-like of shape (n_samples,), dtype=int
A clustering of the data into disjoint subsets.
labels_pred : array-like of shape (n_samples,), dtype=int
A clustering of the data into disjoint subsets.
sparse : bool, default=False
Compute contingency matrix internally with sparse matrix.
Deprecated since version 1.7: The sparse parameter is deprecated and will be removed in 1.9. It has no effect.
Returns:
score : float
The resulting Fowlkes-Mallows score.
Examples
Perfect labelings are both homogeneous and complete, hence have score 1.0:
>>> from sklearn.metrics.cluster import fowlkes_mallows_score
>>> fowlkes_mallows_score([0, 0, 1, 1], [0, 0, 1, 1])
1.0
>>> fowlkes_mallows_score([0, 0, 1, 1], [1, 1, 0, 0])
1.0
If class members are completely split across different clusters, the assignment is totally random, hence the FMI is zero:
>>> fowlkes_mallows_score([0, 0, 0, 0], [0, 1, 2, 3])
0.0
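The sparse parameter description mentions a contingency matrix. As a rough illustration of why such a table is enough, the sketch below derives the pair counts from cell, row and column totals instead of enumerating pairs. It is a standalone pure-Python sketch and is not presented as the library's code: the function name fmi_from_contingency and the variables tk, pk and qk are assumptions made for this example, as is returning 0.0 when no pair agrees.

```python
from collections import Counter
from math import sqrt


def fmi_from_contingency(labels_true, labels_pred):
    """Hypothetical helper: FMI from contingency-table counts."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))  # contingency table cells
    rows = Counter(labels_true)                     # sizes of true clusters
    cols = Counter(labels_pred)                     # sizes of predicted clusters

    # sum(c * c) - n equals sum(c * (c - 1)), i.e. twice the number of
    # pairs that fall in the same cell, column or row respectively.
    tk = sum(c * c for c in cells.values()) - n  # 2 * TP
    pk = sum(c * c for c in cols.values()) - n   # 2 * (TP + FP)
    qk = sum(c * c for c in rows.values()) - n   # 2 * (TP + FN)

    if tk == 0:
        return 0.0  # assumed convention when no pair agrees
    # The factors of 2 cancel, leaving TP / sqrt((TP + FP) * (TP + FN)).
    return sqrt(tk / pk) * sqrt(tk / qk)


print(fmi_from_contingency([0, 0, 1, 1], [0, 0, 0, 1]))  # about 0.408
```

This needs only one pass over the labels plus a pass over the non-empty cells, which is the practical reason to work from a contingency table.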