NaiveBayes (Spark 3.5.5 JavaDoc)
Object
- org.apache.spark.ml.PipelineStage
  - org.apache.spark.ml.Estimator
    - org.apache.spark.ml.Predictor<FeaturesType,E,M>
      - org.apache.spark.ml.classification.Classifier<FeaturesType,E,M>
        - org.apache.spark.ml.classification.ProbabilisticClassifier<Vector,NaiveBayes,NaiveBayesModel>
          - org.apache.spark.ml.classification.NaiveBayes
All Implemented Interfaces:
java.io.Serializable, org.apache.spark.internal.Logging, ClassifierParams, NaiveBayesParams, ProbabilisticClassifierParams, Params, HasFeaturesCol, HasLabelCol, HasPredictionCol, HasProbabilityCol, HasRawPredictionCol, HasThresholds, HasWeightCol, PredictorParams, DefaultParamsWritable, Identifiable, MLWritable
public class NaiveBayes
extends ProbabilisticClassifier<Vector,NaiveBayes,NaiveBayesModel>
implements NaiveBayesParams, DefaultParamsWritable
Naive Bayes classifiers. This estimator supports Multinomial NB, which can handle finitely supported discrete data. For example, by converting documents into TF-IDF vectors, it can be used for document classification. By making every vector binary (0/1), it can also be used as Bernoulli NB. The input feature values for Multinomial NB and Bernoulli NB must be nonnegative.

Since 3.0.0, it supports Complement NB, an adaptation of Multinomial NB. Specifically, Complement NB uses statistics from the complement of each class to compute the model's coefficients. The inventors of Complement NB show empirically that the parameter estimates for CNB are more stable than those for Multinomial NB. Like Multinomial NB, the input feature values for Complement NB must be nonnegative.

Since 3.0.0, it also supports Gaussian NB, which can handle continuous data.
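To make the role of nonnegative features and the smoothing parameter concrete, the following is a minimal, self-contained sketch of multinomial Naive Bayes in plain Java. It is not Spark's implementation and does not use the Spark API: the class name `MultinomialNbSketch` and its methods are hypothetical, and the additive-smoothing formulas are the textbook ones, assumed here for illustration.

```java
import java.util.Arrays;

/** Illustrative only: a tiny multinomial Naive Bayes, NOT Spark's implementation. */
final class MultinomialNbSketch {

    /** Smoothed log class priors: log((n_c + lambda) / (n + numClasses * lambda)). */
    public static double[] logPriors(int[] labels, int numClasses, double lambda) {
        double[] counts = new double[numClasses];
        for (int y : labels) {
            counts[y] += 1.0;
        }
        double logDenom = Math.log(labels.length + numClasses * lambda);
        double[] out = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            out[c] = Math.log(counts[c] + lambda) - logDenom;
        }
        return out;
    }

    /** Smoothed log likelihoods: log((sum_cj + lambda) / (sum_c + numFeatures * lambda)). */
    public static double[][] logLikelihoods(double[][] x, int[] labels, int numClasses, double lambda) {
        int numFeatures = x[0].length;
        double[][] sums = new double[numClasses][numFeatures];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < numFeatures; j++) {
                // Counts or TF-IDF weights only: negative values have no meaning here.
                if (x[i][j] < 0.0) {
                    throw new IllegalArgumentException("feature values must be nonnegative");
                }
                sums[labels[i]][j] += x[i][j];
            }
        }
        double[][] out = new double[numClasses][numFeatures];
        for (int c = 0; c < numClasses; c++) {
            double total = Arrays.stream(sums[c]).sum();
            double logDenom = Math.log(total + numFeatures * lambda);
            for (int j = 0; j < numFeatures; j++) {
                out[c][j] = Math.log(sums[c][j] + lambda) - logDenom;
            }
        }
        return out;
    }

    /** Returns the class with the highest log prior plus feature-weighted log likelihood. */
    public static int predict(double[] features, double[] logPrior, double[][] logTheta) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < logPrior.length; c++) {
            double score = logPrior[c];
            for (int j = 0; j < features.length; j++) {
                score += features[j] * logTheta[c][j];
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Four toy "documents" over a three-term vocabulary, two classes.
        double[][] x = {{2, 0, 1}, {3, 0, 0}, {0, 2, 1}, {0, 3, 0}};
        int[] y = {0, 0, 1, 1};
        double lambda = 1.0; // same value as the documented default for smoothing
        double[] prior = logPriors(y, 2, lambda);
        double[][] theta = logLikelihoods(x, y, 2, lambda);
        System.out.println(predict(new double[] {1, 0, 0}, prior, theta)); // prints 0
        System.out.println(predict(new double[] {0, 1, 0}, prior, theta)); // prints 1
    }
}
```

With smoothing set to 1.0, a term that never occurs in a class still gets a small nonzero probability (1/9 in this toy data), so a single unseen term does not drive a class score to negative infinity.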
See Also:
Serialized Form
Nested Class Summary
* ### Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging `org.apache.spark.internal.Logging.SparkShellLoggingFilter`
Constructor Summary
Constructors

| Constructor and Description |
| --- |
| `NaiveBayes()` |
| `NaiveBayes(String uid)` |

Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| `NaiveBayes` | `copy(ParamMap extra)` Creates a copy of this instance with the same UID and some extra params. |
| `static NaiveBayes` | `load(String path)` |
| `Param<String>` | `modelType()` The model type which is a string (case-sensitive). |
| `static MLReader<T>` | `read()` |
| `NaiveBayes` | `setModelType(String value)` Set the model type using a string (case-sensitive). |
| `NaiveBayes` | `setSmoothing(double value)` Set the smoothing parameter. |
| `NaiveBayes` | `setWeightCol(String value)` Sets the value of param weightCol. |
| `DoubleParam` | `smoothing()` The smoothing parameter. |
| `String` | `uid()` An immutable unique ID for the object and its derivatives. |
| `Param<String>` | `weightCol()` Param for weight column name. |

* ### Methods inherited from class org.apache.spark.ml.classification.[ProbabilisticClassifier](../../../../../org/apache/spark/ml/classification/ProbabilisticClassifier.html "class in org.apache.spark.ml.classification")
  `[probabilityCol](../../../../../org/apache/spark/ml/classification/ProbabilisticClassifier.html#probabilityCol--), [setProbabilityCol](../../../../../org/apache/spark/ml/classification/ProbabilisticClassifier.html#setProbabilityCol-java.lang.String-), [setThresholds](../../../../../org/apache/spark/ml/classification/ProbabilisticClassifier.html#setThresholds-double:A-), [thresholds](../../../../../org/apache/spark/ml/classification/ProbabilisticClassifier.html#thresholds--)`
* ### Methods inherited from class org.apache.spark.ml.classification.[Classifier](../../../../../org/apache/spark/ml/classification/Classifier.html "class in org.apache.spark.ml.classification")
  `[rawPredictionCol](../../../../../org/apache/spark/ml/classification/Classifier.html#rawPredictionCol--), [setRawPredictionCol](../../../../../org/apache/spark/ml/classification/Classifier.html#setRawPredictionCol-java.lang.String-)`
* ### Methods inherited from class org.apache.spark.ml.[Predictor](../../../../../org/apache/spark/ml/Predictor.html "class in org.apache.spark.ml")
  `[featuresCol](../../../../../org/apache/spark/ml/Predictor.html#featuresCol--), [fit](../../../../../org/apache/spark/ml/Predictor.html#fit-org.apache.spark.sql.Dataset-), [labelCol](../../../../../org/apache/spark/ml/Predictor.html#labelCol--), [predictionCol](../../../../../org/apache/spark/ml/Predictor.html#predictionCol--), [setFeaturesCol](../../../../../org/apache/spark/ml/Predictor.html#setFeaturesCol-java.lang.String-), [setLabelCol](../../../../../org/apache/spark/ml/Predictor.html#setLabelCol-java.lang.String-), [setPredictionCol](../../../../../org/apache/spark/ml/Predictor.html#setPredictionCol-java.lang.String-), [transformSchema](../../../../../org/apache/spark/ml/Predictor.html#transformSchema-org.apache.spark.sql.types.StructType-)`
* ### Methods inherited from class org.apache.spark.ml.[Estimator](../../../../../org/apache/spark/ml/Estimator.html "class in org.apache.spark.ml")
  `[fit](../../../../../org/apache/spark/ml/Estimator.html#fit-org.apache.spark.sql.Dataset-org.apache.spark.ml.param.ParamMap-), [fit](../../../../../org/apache/spark/ml/Estimator.html#fit-org.apache.spark.sql.Dataset-org.apache.spark.ml.param.ParamPair-org.apache.spark.ml.param.ParamPair...-), [fit](../../../../../org/apache/spark/ml/Estimator.html#fit-org.apache.spark.sql.Dataset-org.apache.spark.ml.param.ParamPair-scala.collection.Seq-), [fit](../../../../../org/apache/spark/ml/Estimator.html#fit-org.apache.spark.sql.Dataset-scala.collection.Seq-)`
* ### Methods inherited from class org.apache.spark.ml.[PipelineStage](../../../../../org/apache/spark/ml/PipelineStage.html "class in org.apache.spark.ml")
  `[params](../../../../../org/apache/spark/ml/PipelineStage.html#params--)`
* ### Methods inherited from class Object
  `equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`
* ### Methods inherited from interface org.apache.spark.ml.classification.[NaiveBayesParams](../../../../../org/apache/spark/ml/classification/NaiveBayesParams.html "interface in org.apache.spark.ml.classification")
  `[getModelType](../../../../../org/apache/spark/ml/classification/NaiveBayesParams.html#getModelType--), [getSmoothing](../../../../../org/apache/spark/ml/classification/NaiveBayesParams.html#getSmoothing--)`
* ### Methods inherited from interface org.apache.spark.ml.[PredictorParams](../../../../../org/apache/spark/ml/PredictorParams.html "interface in org.apache.spark.ml")
  `[validateAndTransformSchema](../../../../../org/apache/spark/ml/PredictorParams.html#validateAndTransformSchema-org.apache.spark.sql.types.StructType-boolean-org.apache.spark.sql.types.DataType-)`
* ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasLabelCol](../../../../../org/apache/spark/ml/param/shared/HasLabelCol.html "interface in org.apache.spark.ml.param.shared")
  `[getLabelCol](../../../../../org/apache/spark/ml/param/shared/HasLabelCol.html#getLabelCol--), [labelCol](../../../../../org/apache/spark/ml/param/shared/HasLabelCol.html#labelCol--)`
* ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasFeaturesCol](../../../../../org/apache/spark/ml/param/shared/HasFeaturesCol.html "interface in org.apache.spark.ml.param.shared")
  `[featuresCol](../../../../../org/apache/spark/ml/param/shared/HasFeaturesCol.html#featuresCol--), [getFeaturesCol](../../../../../org/apache/spark/ml/param/shared/HasFeaturesCol.html#getFeaturesCol--)`
* ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasPredictionCol](../../../../../org/apache/spark/ml/param/shared/HasPredictionCol.html "interface in org.apache.spark.ml.param.shared")
  `[getPredictionCol](../../../../../org/apache/spark/ml/param/shared/HasPredictionCol.html#getPredictionCol--), [predictionCol](../../../../../org/apache/spark/ml/param/shared/HasPredictionCol.html#predictionCol--)`
* ### Methods inherited from interface org.apache.spark.ml.param.[Params](../../../../../org/apache/spark/ml/param/Params.html "interface in org.apache.spark.ml.param")
  `[clear](../../../../../org/apache/spark/ml/param/Params.html#clear-org.apache.spark.ml.param.Param-), [copyValues](../../../../../org/apache/spark/ml/param/Params.html#copyValues-T-org.apache.spark.ml.param.ParamMap-), [defaultCopy](../../../../../org/apache/spark/ml/param/Params.html#defaultCopy-org.apache.spark.ml.param.ParamMap-), [defaultParamMap](../../../../../org/apache/spark/ml/param/Params.html#defaultParamMap--), [explainParam](../../../../../org/apache/spark/ml/param/Params.html#explainParam-org.apache.spark.ml.param.Param-), [explainParams](../../../../../org/apache/spark/ml/param/Params.html#explainParams--), [extractParamMap](../../../../../org/apache/spark/ml/param/Params.html#extractParamMap--), [extractParamMap](../../../../../org/apache/spark/ml/param/Params.html#extractParamMap-org.apache.spark.ml.param.ParamMap-), [get](../../../../../org/apache/spark/ml/param/Params.html#get-org.apache.spark.ml.param.Param-), [getDefault](../../../../../org/apache/spark/ml/param/Params.html#getDefault-org.apache.spark.ml.param.Param-), [getOrDefault](../../../../../org/apache/spark/ml/param/Params.html#getOrDefault-org.apache.spark.ml.param.Param-), [getParam](../../../../../org/apache/spark/ml/param/Params.html#getParam-java.lang.String-), [hasDefault](../../../../../org/apache/spark/ml/param/Params.html#hasDefault-org.apache.spark.ml.param.Param-), [hasParam](../../../../../org/apache/spark/ml/param/Params.html#hasParam-java.lang.String-), [isDefined](../../../../../org/apache/spark/ml/param/Params.html#isDefined-org.apache.spark.ml.param.Param-), [isSet](../../../../../org/apache/spark/ml/param/Params.html#isSet-org.apache.spark.ml.param.Param-), [onParamChange](../../../../../org/apache/spark/ml/param/Params.html#onParamChange-org.apache.spark.ml.param.Param-), [paramMap](../../../../../org/apache/spark/ml/param/Params.html#paramMap--), [params](../../../../../org/apache/spark/ml/param/Params.html#params--), [set](../../../../../org/apache/spark/ml/param/Params.html#set-org.apache.spark.ml.param.Param-T-), [set](../../../../../org/apache/spark/ml/param/Params.html#set-org.apache.spark.ml.param.ParamPair-), [set](../../../../../org/apache/spark/ml/param/Params.html#set-java.lang.String-java.lang.Object-), [setDefault](../../../../../org/apache/spark/ml/param/Params.html#setDefault-org.apache.spark.ml.param.Param-T-), [setDefault](../../../../../org/apache/spark/ml/param/Params.html#setDefault-scala.collection.Seq-), [shouldOwn](../../../../../org/apache/spark/ml/param/Params.html#shouldOwn-org.apache.spark.ml.param.Param-)`
* ### Methods inherited from interface org.apache.spark.ml.util.[Identifiable](../../../../../org/apache/spark/ml/util/Identifiable.html "interface in org.apache.spark.ml.util")
  `[toString](../../../../../org/apache/spark/ml/util/Identifiable.html#toString--)`
* ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasWeightCol](../../../../../org/apache/spark/ml/param/shared/HasWeightCol.html "interface in org.apache.spark.ml.param.shared")
  `[getWeightCol](../../../../../org/apache/spark/ml/param/shared/HasWeightCol.html#getWeightCol--)`
* ### Methods inherited from interface org.apache.spark.ml.util.[DefaultParamsWritable](../../../../../org/apache/spark/ml/util/DefaultParamsWritable.html "interface in org.apache.spark.ml.util")
  `[write](../../../../../org/apache/spark/ml/util/DefaultParamsWritable.html#write--)`
* ### Methods inherited from interface org.apache.spark.ml.util.[MLWritable](../../../../../org/apache/spark/ml/util/MLWritable.html "interface in org.apache.spark.ml.util")
  `[save](../../../../../org/apache/spark/ml/util/MLWritable.html#save-java.lang.String-)`
* ### Methods inherited from interface org.apache.spark.ml.classification.[ProbabilisticClassifierParams](../../../../../org/apache/spark/ml/classification/ProbabilisticClassifierParams.html "interface in org.apache.spark.ml.classification")
  `[validateAndTransformSchema](../../../../../org/apache/spark/ml/classification/ProbabilisticClassifierParams.html#validateAndTransformSchema-org.apache.spark.sql.types.StructType-boolean-org.apache.spark.sql.types.DataType-)`
* ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasRawPredictionCol](../../../../../org/apache/spark/ml/param/shared/HasRawPredictionCol.html "interface in org.apache.spark.ml.param.shared")
  `[getRawPredictionCol](../../../../../org/apache/spark/ml/param/shared/HasRawPredictionCol.html#getRawPredictionCol--), [rawPredictionCol](../../../../../org/apache/spark/ml/param/shared/HasRawPredictionCol.html#rawPredictionCol--)`
* ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasProbabilityCol](../../../../../org/apache/spark/ml/param/shared/HasProbabilityCol.html "interface in org.apache.spark.ml.param.shared")
  `[getProbabilityCol](../../../../../org/apache/spark/ml/param/shared/HasProbabilityCol.html#getProbabilityCol--)`
* ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasThresholds](../../../../../org/apache/spark/ml/param/shared/HasThresholds.html "interface in org.apache.spark.ml.param.shared")
  `[getThresholds](../../../../../org/apache/spark/ml/param/shared/HasThresholds.html#getThresholds--)`
* ### Methods inherited from interface org.apache.spark.internal.Logging
  `$init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize`
Constructor Detail
* #### NaiveBayes
  public NaiveBayes(String uid)
* #### NaiveBayes
  public NaiveBayes()
Method Detail
* #### load
  public static [NaiveBayes](../../../../../org/apache/spark/ml/classification/NaiveBayes.html "class in org.apache.spark.ml.classification") load(String path)
* #### read
  public static [MLReader](../../../../../org/apache/spark/ml/util/MLReader.html "class in org.apache.spark.ml.util")<T> read()
* #### smoothing
  public final [DoubleParam](../../../../../org/apache/spark/ml/param/DoubleParam.html "class in org.apache.spark.ml.param") smoothing()

  The smoothing parameter. (default = 1.0).

  Specified by: `[smoothing](../../../../../org/apache/spark/ml/classification/NaiveBayesParams.html#smoothing--)` in interface `[NaiveBayesParams](../../../../../org/apache/spark/ml/classification/NaiveBayesParams.html "interface in org.apache.spark.ml.classification")`

  Returns: (undocumented)
* #### modelType
  public final [Param](../../../../../org/apache/spark/ml/param/Param.html "class in org.apache.spark.ml.param")<String> modelType()

  The model type which is a string (case-sensitive). Supported options: "multinomial", "complement", "bernoulli", "gaussian". (default = multinomial)

  Specified by: `[modelType](../../../../../org/apache/spark/ml/classification/NaiveBayesParams.html#modelType--)` in interface `[NaiveBayesParams](../../../../../org/apache/spark/ml/classification/NaiveBayesParams.html "interface in org.apache.spark.ml.classification")`

  Returns: (undocumented)
* #### weightCol
  public final [Param](../../../../../org/apache/spark/ml/param/Param.html "class in org.apache.spark.ml.param")<String> weightCol()

  Param for weight column name. If this is not set or empty, we treat all instance weights as 1.0.

  Specified by: `[weightCol](../../../../../org/apache/spark/ml/param/shared/HasWeightCol.html#weightCol--)` in interface `[HasWeightCol](../../../../../org/apache/spark/ml/param/shared/HasWeightCol.html "interface in org.apache.spark.ml.param.shared")`

  Returns: (undocumented)
* #### uid
  public String uid()

  An immutable unique ID for the object and its derivatives.

  Specified by: `[uid](../../../../../org/apache/spark/ml/util/Identifiable.html#uid--)` in interface `[Identifiable](../../../../../org/apache/spark/ml/util/Identifiable.html "interface in org.apache.spark.ml.util")`

  Returns: (undocumented)
* #### setSmoothing
  public [NaiveBayes](../../../../../org/apache/spark/ml/classification/NaiveBayes.html "class in org.apache.spark.ml.classification") setSmoothing(double value)

  Set the smoothing parameter. Default is 1.0.

  Parameters: `value` - (undocumented)

  Returns: (undocumented)
* #### setModelType
  public [NaiveBayes](../../../../../org/apache/spark/ml/classification/NaiveBayes.html "class in org.apache.spark.ml.classification") setModelType(String value)

  Set the model type using a string (case-sensitive). Supported options: "multinomial", "complement", "bernoulli", and "gaussian". Default is "multinomial".

  Parameters: `value` - (undocumented)

  Returns: (undocumented)
* #### setWeightCol
  public [NaiveBayes](../../../../../org/apache/spark/ml/classification/NaiveBayes.html "class in org.apache.spark.ml.classification") setWeightCol(String value)

  Sets the value of param `weightCol`. If this is not set or empty, we treat all instance weights as 1.0. Default is not set, so all instances have weight one.

  Parameters: `value` - (undocumented)

  Returns: (undocumented)
* #### copy
  public [NaiveBayes](../../../../../org/apache/spark/ml/classification/NaiveBayes.html "class in org.apache.spark.ml.classification") copy([ParamMap](../../../../../org/apache/spark/ml/param/ParamMap.html "class in org.apache.spark.ml.param") extra)

  Description copied from interface: `[Params](../../../../../org/apache/spark/ml/param/Params.html#copy-org.apache.spark.ml.param.ParamMap-)`

  Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See `defaultCopy()`.

  Specified by: `[copy](../../../../../org/apache/spark/ml/param/Params.html#copy-org.apache.spark.ml.param.ParamMap-)` in interface `[Params](../../../../../org/apache/spark/ml/param/Params.html "interface in org.apache.spark.ml.param")`

  Specified by: `[copy](../../../../../org/apache/spark/ml/Predictor.html#copy-org.apache.spark.ml.param.ParamMap-)` in class `[Predictor](../../../../../org/apache/spark/ml/Predictor.html "class in org.apache.spark.ml")<[Vector](../../../../../org/apache/spark/ml/linalg/Vector.html "interface in org.apache.spark.ml.linalg"),[NaiveBayes](../../../../../org/apache/spark/ml/classification/NaiveBayes.html "class in org.apache.spark.ml.classification"),[NaiveBayesModel](../../../../../org/apache/spark/ml/classification/NaiveBayesModel.html "class in org.apache.spark.ml.classification")>`

  Parameters: `extra` - (undocumented)

  Returns: (undocumented)
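As a rough illustration of how the `weightCol` and `smoothing` parameters described above might interact, the plain-Java sketch below computes smoothed class priors from instance weights. It does not use the Spark API and is not taken from Spark's source: the class `WeightedPriorSketch` is hypothetical, and it assumes that weights act as fractional instance counts and that a `null` weight array stands in for an unset weight column (every instance weighted 1.0).

```java
/** Illustrative only: weighted, smoothed class priors. NOT Spark's implementation. */
final class WeightedPriorSketch {

    /**
     * Returns log((w_c + smoothing) / (w + numClasses * smoothing)), where w_c is the
     * total weight of class c and w is the total weight of all instances.
     * Passing weights == null models an unset weight column: each weight is 1.0.
     */
    public static double[] logPriors(int[] labels, double[] weights, int numClasses, double smoothing) {
        double[] weightSums = new double[numClasses];
        double total = 0.0;
        for (int i = 0; i < labels.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i];
            if (w < 0.0) {
                throw new IllegalArgumentException("instance weights must be nonnegative");
            }
            weightSums[labels[i]] += w;
            total += w;
        }
        double logDenom = Math.log(total + numClasses * smoothing);
        double[] out = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            out[c] = Math.log(weightSums[c] + smoothing) - logDenom;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] labels = {0, 0, 1};

        // No weights: priors are (2 + 1) / (3 + 2) = 0.6 and (1 + 1) / (3 + 2) = 0.4.
        double[] unweighted = logPriors(labels, null, 2, 1.0);
        System.out.println(Math.exp(unweighted[0]) + " " + Math.exp(unweighted[1]));

        // Up-weighting the single class-1 instance shifts the prior toward class 1:
        // (2 + 1) / (6 + 2) = 0.375 and (4 + 1) / (6 + 2) = 0.625.
        double[] weighted = logPriors(labels, new double[] {1.0, 1.0, 4.0}, 2, 1.0);
        System.out.println(Math.exp(weighted[0]) + " " + Math.exp(weighted[1]));
    }
}
```

Under these assumptions, leaving the weight column unset gives the same result as supplying a column of ones.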