estimate_bandwidth

sklearn.cluster.estimate_bandwidth(X, *, quantile=0.3, n_samples=None, random_state=0, n_jobs=None)

Estimate the bandwidth to use with the mean-shift algorithm.

This function takes time at least quadratic in the number of samples. For large datasets, it is wise to subsample by setting the n_samples parameter. Alternatively, the bandwidth parameter of MeanShift can be set directly to a small value without estimating it.
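The subsampling advice can be sketched as follows. The synthetic dataset from make_blobs, its size, and the chosen quantile and seed are illustrative assumptions, not values recommended by the documentation.

```python
from sklearn.cluster import estimate_bandwidth
from sklearn.datasets import make_blobs

# Illustrative data (an assumption for this sketch): three Gaussian blobs.
X, _ = make_blobs(
    n_samples=2000,
    centers=[[1, 1], [-1, -1], [1, -1]],
    cluster_std=0.6,
    random_state=0,
)

# Estimate on every point: cost grows at least quadratically with len(X).
bw_full = estimate_bandwidth(X, quantile=0.2)

# Estimate on a random subset of 500 points; a fixed integer random_state
# makes the subset, and therefore the result, repeatable.
bw_sub = estimate_bandwidth(X, quantile=0.2, n_samples=500, random_state=0)
bw_sub_again = estimate_bandwidth(X, quantile=0.2, n_samples=500, random_state=0)

print(f"full: {bw_full:.3f}  subsampled: {bw_sub:.3f}")
```

On data like this the subsampled estimate would be expected to land near the full one, but how close it gets depends on the dataset and the subset size.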

Parameters:

X : array-like of shape (n_samples, n_features)

Input points.

quantile : float, default=0.3

Should be in the range [0, 1]. A value of 0.5 means that the median of all pairwise distances is used.

n_samples : int, default=None

The number of samples to use. If not given, all samples are used.

random_state : int, RandomState instance, default=0

The generator used to randomly select the samples from input points for bandwidth estimation. Use an int to make the randomness deterministic. See Glossary.

n_jobs : int, default=None

The number of parallel jobs to run for neighbors search. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.
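To get a feel for the quantile parameter, the sketch below evaluates the estimate at several quantiles on the same data. The dataset and the particular quantile values are assumptions chosen only for illustration.

```python
from sklearn.cluster import estimate_bandwidth
from sklearn.datasets import make_blobs

# Illustrative data (an assumption for this sketch).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)

quantiles = [0.1, 0.3, 0.5, 0.9]
bandwidths = [estimate_bandwidth(X, quantile=q) for q in quantiles]

for q, bw in zip(quantiles, bandwidths):
    print(f"quantile={q:.1f} -> bandwidth={bw:.3f}")
```

A larger quantile looks at more distant neighbors, so the estimated bandwidth should not decrease as the quantile grows. Very small quantiles on tiny datasets can give a bandwidth of zero, so it is worth checking the result before passing it on.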

Returns:

bandwidth : float

The bandwidth parameter.
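As a rough way to see what the returned number represents, the sketch below recomputes it by hand. It assumes the estimate is the mean, over all points, of the distance to the k-th nearest neighbor, with k = int(n_points * quantile) and the point itself counted as a neighbor. That is an assumption about the implementation, not something stated on this page, so treat it as an illustration rather than a specification.

```python
import numpy as np
from sklearn.cluster import estimate_bandwidth
from sklearn.neighbors import NearestNeighbors

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]], dtype=float)
quantile = 0.5

# Assumed rule: k is the chosen fraction of the points, at least 1.
k = max(1, int(X.shape[0] * quantile))

# Querying with the fitted points means each point is its own nearest neighbor.
distances, _ = NearestNeighbors(n_neighbors=k).fit(X).kneighbors(X)

# Distance to the k-th neighbor per point, averaged over all points.
manual = distances.max(axis=1).mean()
reference = estimate_bandwidth(X, quantile=quantile)

print(manual, reference)
```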

Examples

>>> import numpy as np
>>> from sklearn.cluster import estimate_bandwidth
>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6]])
>>> estimate_bandwidth(X, quantile=0.5)
np.float64(1.61...)
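A typical next step is to pass the estimate to MeanShift. The following is a minimal sketch under assumed settings: the blob dataset, quantile=0.2 and bin_seeding=True are illustrative choices, not recommendations from this page.

```python
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

# Illustrative data (an assumption for this sketch).
X, _ = make_blobs(
    n_samples=500,
    centers=[[1, 1], [-1, -1], [1, -1]],
    cluster_std=0.6,
    random_state=0,
)

# Estimate the kernel bandwidth, then cluster with it.
bandwidth = estimate_bandwidth(X, quantile=0.2)
ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
ms.fit(X)

n_clusters = len(ms.cluster_centers_)
print(f"bandwidth={bandwidth:.3f}, clusters found={n_clusters}")
```

The number of clusters found depends strongly on the bandwidth, so it can be worth trying a few quantile values and comparing the results.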

Gallery examples