compute_class_weight
sklearn.utils.class_weight.compute_class_weight(class_weight, *, classes, y, sample_weight=None)
Estimate class weights for unbalanced datasets.
Parameters:
class_weight : dict, “balanced” or None
If “balanced”, class weights are given by n_samples / (n_classes * np.bincount(y)), or their weighted equivalent if sample_weight is provided. If a dictionary is given, keys are classes and values are the corresponding class weights. If None, the class weights are uniform.
classes : ndarray
Array of the classes occurring in the data, as given by np.unique(y_org), with y_org the original class labels.
y : array-like of shape (n_samples,)
Array of original class labels per sample.
sample_weight : array-like of shape (n_samples,), default=None
Array of weights that are assigned to individual samples. Only used when class_weight='balanced'.
Returns:
class_weight_vect : ndarray of shape (n_classes,)
Array with class_weight_vect[i] the weight for the i-th class.
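The “balanced” formula from the parameter description can be sketched in plain Python. This is a minimal illustration, not the library's implementation: balanced_weights is a hypothetical helper name, and the treatment of sample_weight (per-class counts replaced by per-class sums of weights, and n_samples by the total weight) is an assumption about what “weighted equivalent” means.

```python
from collections import Counter


def balanced_weights(y, classes, sample_weight=None):
    """Hypothetical helper: total / (n_classes * per-class total).

    Assumes every entry of `classes` occurs at least once in `y`.
    """
    if sample_weight is None:
        # Without sample weights, every sample counts as 1.
        sample_weight = [1.0] * len(y)

    # Sum the weight carried by each class (plain counts if unweighted).
    totals = Counter()
    for label, w in zip(y, sample_weight):
        totals[label] += w

    total = sum(totals.values())
    n_classes = len(classes)
    # Return one weight per class, in the order given by `classes`.
    return [total / (n_classes * totals[c]) for c in classes]


y = [1, 1, 1, 1, 0, 0]
classes = sorted(set(y))  # plays the role of np.unique(y)

unweighted = balanced_weights(y, classes)
# Assumed weighted variant: the two class-0 samples count double.
weighted = balanced_weights(y, classes, sample_weight=[1, 1, 1, 1, 2, 2])

print(unweighted)
print(weighted)
```

In the weighted call, both classes carry the same total weight, so the sketch assigns them equal class weights.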
References
The “balanced” heuristic is inspired by Logistic Regression in Rare Events Data, King, Zeng, 2001.
Examples
>>> import numpy as np
>>> from sklearn.utils.class_weight import compute_class_weight
>>> y = [1, 1, 1, 1, 0, 0]
>>> compute_class_weight(class_weight="balanced", classes=np.unique(y), y=y)
array([1.5 , 0.75])
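The two values in the example follow directly from the “balanced” formula. With n_samples = 6, n_classes = 2, and np.bincount(y) giving 2 samples of class 0 and 4 samples of class 1, the arithmetic is (the symbols w_0 and w_1 are introduced here only for illustration):

```latex
w_0 = \frac{6}{2 \cdot 2} = 1.5, \qquad w_1 = \frac{6}{2 \cdot 4} = 0.75
```

The rarer class 0 receives the larger weight, and the result is ordered by classes, so class 0 comes first.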