cluster_optics_xi (original) (raw)
sklearn.cluster.cluster_optics_xi(*, reachability, predecessor, ordering, min_samples, min_cluster_size=None, xi=0.05, predecessor_correction=True)[source]#
Automatically extract clusters according to the Xi-steep method.
Parameters:
reachabilityndarray of shape (n_samples,)
Reachability distances calculated by OPTICS (reachability_
).
predecessorndarray of shape (n_samples,)
Predecessors calculated by OPTICS.
orderingndarray of shape (n_samples,)
OPTICS ordered point indices (ordering_
).
min_samplesint > 1 or float between 0 and 1
The same as the min_samples given to OPTICS. Up and down steep regions can’t have more then min_samples
consecutive non-steep points. Expressed as an absolute number or a fraction of the number of samples (rounded to be at least 2).
min_cluster_sizeint > 1 or float between 0 and 1, default=None
Minimum number of samples in an OPTICS cluster, expressed as an absolute number or a fraction of the number of samples (rounded to be at least 2). If None
, the value of min_samples
is used instead.
xifloat between 0 and 1, default=0.05
Determines the minimum steepness on the reachability plot that constitutes a cluster boundary. For example, an upwards point in the reachability plot is defined by the ratio from one point to its successor being at most 1-xi.
predecessor_correctionbool, default=True
Correct clusters based on the calculated predecessors.
Returns:
labelsndarray of shape (n_samples,)
The labels assigned to samples. Points which are not included in any cluster are labeled as -1.
clustersndarray of shape (n_clusters, 2)
The list of clusters in the form of [start, end]
in each row, with all indices inclusive. The clusters are ordered according to (end, -start)
(ascending) so that larger clusters encompassing smaller clusters come after such nested smaller clusters. Since labels
does not reflect the hierarchy, usually len(clusters) > np.unique(labels)
.
Examples
import numpy as np from sklearn.cluster import cluster_optics_xi, compute_optics_graph X = np.array([[1, 2], [2, 5], [3, 6], ... [8, 7], [8, 8], [7, 3]]) ordering, core_distances, reachability, predecessor = compute_optics_graph( ... X, ... min_samples=2, ... max_eps=np.inf, ... metric="minkowski", ... p=2, ... metric_params=None, ... algorithm="auto", ... leaf_size=30, ... n_jobs=None ... ) min_samples = 2 labels, clusters = cluster_optics_xi( ... reachability=reachability, ... predecessor=predecessor, ... ordering=ordering, ... min_samples=min_samples, ... ) labels array([0, 0, 0, 1, 1, 1]) clusters array([[0, 2], [3, 5], [0, 5]])