BisectingKMeansModel (Spark 3.5.5 JavaDoc)
Object
- org.apache.spark.mllib.clustering.BisectingKMeansModel
All Implemented Interfaces:
java.io.Serializable, org.apache.spark.internal.Logging, Saveable
public class BisectingKMeansModel
extends Object
implements scala.Serializable, Saveable, org.apache.spark.internal.Logging
Clustering model produced by BisectingKMeans. Prediction is performed level by level, from the root node down to a leaf node: at each node, the child whose center is closest to the input point is selected.
param: root the root node of the clustering tree
See Also:
Serialized Form
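The level-by-level descent described above can be sketched as follows. This is a hedged illustration, not Spark's implementation: the `Node` class, its fields, and the standalone `predict` method are hypothetical stand-ins for the internal `ClusteringTreeNode` and the model's `predict(Vector)`.

```java
import java.util.List;

// Sketch of level-by-level prediction in a bisecting k-means tree.
// Node is a hypothetical stand-in for ClusteringTreeNode.
public class TreePredictSketch {
    static class Node {
        final double[] center;     // cluster center of this node
        final int index;           // cluster index if this node is a leaf
        final List<Node> children; // empty for leaves

        Node(double[] center, int index, List<Node> children) {
            this.center = center;
            this.index = index;
            this.children = children;
        }
    }

    static double squaredDistance(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    // Walk from the root to a leaf; at each level descend into the child
    // whose center is closest to the point, then return the leaf's index.
    static int predict(Node root, double[] point) {
        Node node = root;
        while (!node.children.isEmpty()) {
            Node best = node.children.get(0);
            for (Node child : node.children) {
                if (squaredDistance(child.center, point)
                        < squaredDistance(best.center, point)) {
                    best = child;
                }
            }
            node = best;
        }
        return node.index;
    }

    public static void main(String[] args) {
        // Two leaf clusters (centers 0.0 and 10.0) under one root.
        Node left = new Node(new double[]{0.0}, 0, List.of());
        Node right = new Node(new double[]{10.0}, 1, List.of());
        Node root = new Node(new double[]{5.0}, -1, List.of(left, right));
        System.out.println(predict(root, new double[]{1.0}));  // 0
        System.out.println(predict(root, new double[]{9.0}));  // 1
    }
}
```

Because assignment follows the tree rather than scanning all leaf centers, a prediction costs roughly the depth of the tree times the branching factor, not `k` distance computations.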
Nested Class Summary
Nested Classes
Modifier and Type | Class and Description
--- | ---
static class | BisectingKMeansModel.SaveLoadV1_0$
static class | BisectingKMeansModel.SaveLoadV2_0$
static class | BisectingKMeansModel.SaveLoadV3_0$
* ### Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
  `org.apache.spark.internal.Logging.SparkShellLoggingFilter`
Constructor Summary
Constructors
Constructor and Description
BisectingKMeansModel(org.apache.spark.mllib.clustering.ClusteringTreeNode root)
Method Summary
All Methods Static Methods Instance Methods Concrete Methods
Modifier and Type | Method and Description
--- | ---
Vector[] | clusterCenters() Leaf cluster centers.
double | computeCost(JavaRDD<Vector> data) Java-friendly version of computeCost().
double | computeCost(RDD<Vector> data) Computes the sum of squared distances between the input points and their corresponding cluster centers.
double | computeCost(Vector point) Computes the squared distance between the input point and the cluster center it belongs to.
String | distanceMeasure()
int | k()
static BisectingKMeansModel | load(SparkContext sc, String path)
JavaRDD<Integer> | predict(JavaRDD<Vector> points) Java-friendly version of predict().
RDD<Object> | predict(RDD<Vector> points) Predicts the indices of the clusters that the input points belong to.
int | predict(Vector point) Predicts the index of the cluster that the input point belongs to.
void | save(SparkContext sc, String path) Save this model to the given path.
double | trainingCost()
* ### Methods inherited from class Object
  `equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`
* ### Methods inherited from interface org.apache.spark.internal.Logging
  `$init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize`
Constructor Detail
* #### BisectingKMeansModel
  public BisectingKMeansModel(org.apache.spark.mllib.clustering.ClusteringTreeNode root)
Method Detail
* #### load
  public static [BisectingKMeansModel](../../../../../org/apache/spark/mllib/clustering/BisectingKMeansModel.html "class in org.apache.spark.mllib.clustering") load([SparkContext](../../../../../org/apache/spark/SparkContext.html "class in org.apache.spark") sc, String path)
* #### distanceMeasure
  public String distanceMeasure()
* #### trainingCost
  public double trainingCost()
* #### clusterCenters
  public [Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")[] clusterCenters()
  Leaf cluster centers.
  Returns: (undocumented)
* #### k
  public int k()
* #### predict
  public int predict([Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg") point)
  Predicts the index of the cluster that the input point belongs to.
  Parameters: `point` - (undocumented)
  Returns: (undocumented)
* #### predict
  public [RDD](../../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<Object> predict([RDD](../../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")> points)
  Predicts the indices of the clusters that the input points belong to.
  Parameters: `points` - (undocumented)
  Returns: (undocumented)
* #### predict
  public [JavaRDD](../../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<Integer> predict([JavaRDD](../../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<[Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")> points)
  Java-friendly version of `predict()`.
  Parameters: `points` - (undocumented)
  Returns: (undocumented)
* #### computeCost
  public double computeCost([Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg") point)
  Computes the squared distance between the input point and the cluster center it belongs to.
  Parameters: `point` - (undocumented)
  Returns: (undocumented)
* #### computeCost
  public double computeCost([RDD](../../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")> data)
  Computes the sum of squared distances between the input points and their corresponding cluster centers.
  Parameters: `data` - (undocumented)
  Returns: (undocumented)
* #### computeCost
  public double computeCost([JavaRDD](../../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<[Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")> data)
  Java-friendly version of `computeCost()`.
  Parameters: `data` - (undocumented)
  Returns: (undocumented)
* #### save
  public void save([SparkContext](../../../../../org/apache/spark/SparkContext.html "class in org.apache.spark") sc, String path)
  Description copied from interface: [Saveable](../../../../../org/apache/spark/mllib/util/Saveable.html#save-org.apache.spark.SparkContext-java.lang.String-)
  Save this model to the given path. This saves:
  - human-readable (JSON) model metadata to path/metadata/
  - Parquet formatted data to path/data/
  The model may be loaded using `Loader.load`.
  Specified by: [save](../../../../../org/apache/spark/mllib/util/Saveable.html#save-org.apache.spark.SparkContext-java.lang.String-) in interface [Saveable](../../../../../org/apache/spark/mllib/util/Saveable.html "interface in org.apache.spark.mllib.util")
  Parameters:
  `sc` - Spark context used to save model data.
  `path` - Path specifying the directory in which to save this model. If the directory already exists, this method throws an exception.
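What `computeCost` measures can be sketched without Spark: each point contributes the squared distance to the center of the cluster it is assigned to, and the cost of a dataset is the sum over all points. This is a hedged sketch, not Spark's code; for simplicity it assigns each point to the nearest leaf center (the real model assigns via `predict`, descending the clustering tree), and the `centers` array stands in for `clusterCenters()`.

```java
// Sketch of the cost in computeCost: sum of squared distances from each
// point to its assigned cluster center. Assignment here is nearest-center
// for simplicity; the real model descends the clustering tree instead.
public class ComputeCostSketch {
    static double squaredDistance(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    // Cost of a single point: squared distance to its closest center.
    static double pointCost(double[][] centers, double[] point) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] c : centers) {
            best = Math.min(best, squaredDistance(c, point));
        }
        return best;
    }

    // Dataset cost: sum of per-point costs. In Spark this is an RDD
    // aggregation over the input points rather than a local loop.
    static double computeCost(double[][] centers, double[][] data) {
        double total = 0.0;
        for (double[] p : data) {
            total += pointCost(centers, p);
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] centers = {{0.0}, {10.0}};
        double[][] data = {{1.0}, {9.0}};
        System.out.println(computeCost(centers, data)); // 1 + 1 = 2.0
    }
}
```

Note that `trainingCost()` reports this quantity as measured on the training data at fit time, while `computeCost(RDD<Vector>)` recomputes it for any dataset.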