KMeansModel (Spark 3.5.5 JavaDoc)
Object
- org.apache.spark.mllib.clustering.KMeansModel
All Implemented Interfaces:
java.io.Serializable, PMMLExportable, Saveable
Direct Known Subclasses:
StreamingKMeansModel
public class KMeansModel
extends Object
implements Saveable, scala.Serializable, PMMLExportable
A clustering model for K-means. Each point belongs to the cluster with the closest center.
See Also:
Serialized Form
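To make the class description concrete ("each point belongs to the cluster with the closest center"), here is a minimal plain-Java sketch of nearest-center assignment. It does not use Spark: the class `NearestCenterSketch`, its methods, and the `double[]` point representation are hypothetical stand-ins for illustration only. It also assumes squared Euclidean distance, whereas a real model may be configured with a different `distanceMeasure()`.

```java
// Hypothetical illustration, not part of Spark: nearest-center assignment
// under squared Euclidean distance, using plain arrays instead of Vector.
final class NearestCenterSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the center closest to the point.
    // In this sketch, ties resolve to the lowest index.
    static int predict(double[][] centers, double[] point) {
        if (centers.length == 0) {
            throw new IllegalArgumentException("no centers");
        }
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centers.length; i++) {
            double dist = squaredDistance(centers[i], point);
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centers = { {0.0, 0.0}, {10.0, 10.0} };
        // (1, 2) is closer to centers[0]; (9, 8.5) is closer to centers[1].
        System.out.println(predict(centers, new double[] {1.0, 2.0}));
        System.out.println(predict(centers, new double[] {9.0, 8.5}));
    }
}
```

The returned index plays the same role as the cluster index returned by `predict(Vector point)`, and the number of rows in `centers` corresponds to `k()`.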
Nested Class Summary
Nested Classes
| Modifier and Type | Class and Description |
| --- | --- |
| `static class` | `KMeansModel.Cluster$` |
| `static class` | `KMeansModel.SaveLoadV1_0$` |
| `static class` | `KMeansModel.SaveLoadV2_0$` |

Constructor Summary
Constructors
| Constructor | Description |
| --- | --- |
| `KMeansModel(Iterable<Vector> centers)` | A Java-friendly constructor that takes an Iterable of Vectors. |
| `KMeansModel(Vector[] clusterCenters)` | |
| `KMeansModel(Vector[] clusterCenters, String distanceMeasure, double trainingCost, int numIter)` | |

Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| `Vector[]` | `clusterCenters()` | |
| `double` | `computeCost(RDD<Vector> data)` | Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data. |
| `String` | `distanceMeasure()` | |
| `int` | `k()` | Total number of clusters. |
| `static KMeansModel` | `load(SparkContext sc, String path)` | |
| `JavaRDD<Integer>` | `predict(JavaRDD<Vector> points)` | Maps given points to their cluster indices. |
| `RDD<Object>` | `predict(RDD<Vector> points)` | Maps given points to their cluster indices. |
| `int` | `predict(Vector point)` | Returns the cluster index that a given point belongs to. |
| `void` | `save(SparkContext sc, String path)` | Save this model to the given path. |
| `double` | `trainingCost()` | |

* ### Methods inherited from class Object

  `equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

* ### Methods inherited from interface org.apache.spark.mllib.pmml.[PMMLExportable](../../../../../org/apache/spark/mllib/pmml/PMMLExportable.html "interface in org.apache.spark.mllib.pmml")

  `[toPMML](../../../../../org/apache/spark/mllib/pmml/PMMLExportable.html#toPMML--), [toPMML](../../../../../org/apache/spark/mllib/pmml/PMMLExportable.html#toPMML-java.io.OutputStream-), [toPMML](../../../../../org/apache/spark/mllib/pmml/PMMLExportable.html#toPMML-org.apache.spark.SparkContext-java.lang.String-), [toPMML](../../../../../org/apache/spark/mllib/pmml/PMMLExportable.html#toPMML-javax.xml.transform.stream.StreamResult-), [toPMML](../../../../../org/apache/spark/mllib/pmml/PMMLExportable.html#toPMML-java.lang.String-)`
Constructor Detail
* #### KMeansModel

  public KMeansModel([Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")[] clusterCenters, String distanceMeasure, double trainingCost, int numIter)

* #### KMeansModel

  public KMeansModel([Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")[] clusterCenters)

* #### KMeansModel

  public KMeansModel(Iterable<[Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")> centers)

  A Java-friendly constructor that takes an Iterable of Vectors.

  Parameters: `centers` - (undocumented)
Method Detail
* #### load

  public static [KMeansModel](../../../../../org/apache/spark/mllib/clustering/KMeansModel.html "class in org.apache.spark.mllib.clustering") load([SparkContext](../../../../../org/apache/spark/SparkContext.html "class in org.apache.spark") sc, String path)

* #### clusterCenters

  public [Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")[] clusterCenters()

* #### distanceMeasure

  public String distanceMeasure()

* #### trainingCost

  public double trainingCost()

* #### k

  public int k()

  Total number of clusters.

  Returns: (undocumented)

* #### predict

  public int predict([Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg") point)

  Returns the cluster index that a given point belongs to.

  Parameters: `point` - (undocumented)

  Returns: (undocumented)

* #### predict

  public [RDD](../../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<Object> predict([RDD](../../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")> points)

  Maps given points to their cluster indices.

  Parameters: `points` - (undocumented)

  Returns: (undocumented)

* #### predict

  public [JavaRDD](../../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<Integer> predict([JavaRDD](../../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<[Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")> points)

  Maps given points to their cluster indices.

  Parameters: `points` - (undocumented)

  Returns: (undocumented)

* #### computeCost

  public double computeCost([RDD](../../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[Vector](../../../../../org/apache/spark/mllib/linalg/Vector.html "interface in org.apache.spark.mllib.linalg")> data)

  Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.

  Parameters: `data` - (undocumented)

  Returns: (undocumented)

* #### save

  public void save([SparkContext](../../../../../org/apache/spark/SparkContext.html "class in org.apache.spark") sc, String path)

  Description copied from interface: `[Saveable](../../../../../org/apache/spark/mllib/util/Saveable.html#save-org.apache.spark.SparkContext-java.lang.String-)`

  Save this model to the given path.

  This saves:
  - human-readable (JSON) model metadata to path/metadata/
  - Parquet formatted data to path/data/

  The model may be loaded using `Loader.load`.

  Specified by: `[save](../../../../../org/apache/spark/mllib/util/Saveable.html#save-org.apache.spark.SparkContext-java.lang.String-)` in interface `[Saveable](../../../../../org/apache/spark/mllib/util/Saveable.html "interface in org.apache.spark.mllib.util")`

  Parameters:
  - `sc` - Spark context used to save model data.
  - `path` - Path specifying the directory in which to save this model. If the directory already exists, this method throws an exception.
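As a rough illustration of the quantity that `computeCost` is documented above to return (the sum of squared distances of points to their nearest center), the following plain-Java sketch computes it over in-memory arrays. It is not Spark code: `KMeansCostSketch` and its methods are hypothetical names, the data is a `double[][]` rather than an `RDD<Vector>`, and Euclidean distance is assumed; a model configured with a different distance measure would presumably yield a different cost.

```java
// Hypothetical illustration, not part of Spark: K-means cost as the sum of
// squared Euclidean distances from each point to its nearest center.
final class KMeansCostSketch {

    // Squared Euclidean distance from the point to its nearest center.
    static double nearestSquaredDistance(double[][] centers, double[] point) {
        if (centers.length == 0) {
            throw new IllegalArgumentException("no centers");
        }
        double best = Double.POSITIVE_INFINITY;
        for (double[] center : centers) {
            if (center.length != point.length) {
                throw new IllegalArgumentException("dimension mismatch");
            }
            double sum = 0.0;
            for (int i = 0; i < point.length; i++) {
                double d = point[i] - center[i];
                sum += d * d;
            }
            best = Math.min(best, sum);
        }
        return best;
    }

    // Total cost over all points in the data set.
    static double computeCost(double[][] centers, double[][] data) {
        double cost = 0.0;
        for (double[] point : data) {
            cost += nearestSquaredDistance(centers, point);
        }
        return cost;
    }

    public static void main(String[] args) {
        double[][] centers = { {0.0, 0.0}, {10.0, 10.0} };
        double[][] data = { {1.0, 0.0}, {0.0, 2.0}, {10.0, 13.0} };
        // Per-point contributions: 1 + 4 + 9.
        System.out.println(computeCost(centers, data)); // prints 14.0
    }
}
```

A lower cost means the points lie closer to their assigned centers, so the value is mainly useful for comparing models on the same data.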