Tokenizer (Spark 3.5.5 JavaDoc)
- Object
  - org.apache.spark.ml.PipelineStage
    - org.apache.spark.ml.Transformer
      - org.apache.spark.ml.UnaryTransformer<String,scala.collection.Seq<String>,Tokenizer>
        - org.apache.spark.ml.feature.Tokenizer
All Implemented Interfaces:
java.io.Serializable, org.apache.spark.internal.Logging, Params, HasInputCol, HasOutputCol, DefaultParamsWritable, Identifiable, MLWritable
public class Tokenizer
extends UnaryTransformer<String,scala.collection.Seq<String>,Tokenizer>
implements DefaultParamsWritable
A tokenizer that converts the input string to lowercase and then splits it by white spaces.
See Also:
RegexTokenizer, Serialized Form
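The behavior described above (lowercase the input, then split on whitespace) can be approximated in plain Java without any Spark dependency. This is only a sketch under stated assumptions, not Spark's implementation: the class name `TokenizerSketch` and the method `tokenize` are hypothetical, the split pattern `\\s` (one whitespace character per split) is assumed, and `Locale.ROOT` is chosen here only to keep the lowercasing deterministic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the transformation the class description names.
// It does not use or reproduce any Spark API.
class TokenizerSketch {

    // Lowercases the input, then splits it on single whitespace characters.
    // Assumption: the delimiter is the regex "\\s", so two adjacent spaces
    // produce an empty token between them.
    static List<String> tokenize(String input) {
        String lowered = input.toLowerCase(Locale.ROOT);
        return Arrays.asList(lowered.split("\\s"));
    }

    public static void main(String[] args) {
        // Prints [hi, i, heard, about, spark]
        System.out.println(tokenize("Hi I heard about Spark"));
    }
}
```

Under the assumed `\\s` delimiter, runs of whitespace yield empty tokens. When that is not wanted, a pattern-based tokenizer such as the `RegexTokenizer` listed under See Also is the usual alternative.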
Nested Class Summary
* ### Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
  `org.apache.spark.internal.Logging.SparkShellLoggingFilter`
Constructor Summary
Constructors
Constructor and Description
* `Tokenizer()`
* `Tokenizer(String uid)`

Method Summary
| Modifier and Type | Method and Description |
| --- | --- |
| `Tokenizer` | `copy(ParamMap extra)` Creates a copy of this instance with the same UID and some extra params. |
| `static Tokenizer` | `load(String path)` |
| `static MLReader<T>` | `read()` |
| `String` | `uid()` An immutable unique ID for the object and its derivatives. |

* ### Methods inherited from class org.apache.spark.ml.[UnaryTransformer](../../../../../org/apache/spark/ml/UnaryTransformer.html "class in org.apache.spark.ml")
  `[inputCol](../../../../../org/apache/spark/ml/UnaryTransformer.html#inputCol--), [outputCol](../../../../../org/apache/spark/ml/UnaryTransformer.html#outputCol--), [setInputCol](../../../../../org/apache/spark/ml/UnaryTransformer.html#setInputCol-java.lang.String-), [setOutputCol](../../../../../org/apache/spark/ml/UnaryTransformer.html#setOutputCol-java.lang.String-), [transform](../../../../../org/apache/spark/ml/UnaryTransformer.html#transform-org.apache.spark.sql.Dataset-), [transformSchema](../../../../../org/apache/spark/ml/UnaryTransformer.html#transformSchema-org.apache.spark.sql.types.StructType-)`
* ### Methods inherited from class org.apache.spark.ml.[Transformer](../../../../../org/apache/spark/ml/Transformer.html "class in org.apache.spark.ml")
  `[transform](../../../../../org/apache/spark/ml/Transformer.html#transform-org.apache.spark.sql.Dataset-org.apache.spark.ml.param.ParamMap-), [transform](../../../../../org/apache/spark/ml/Transformer.html#transform-org.apache.spark.sql.Dataset-org.apache.spark.ml.param.ParamPair-org.apache.spark.ml.param.ParamPair...-), [transform](../../../../../org/apache/spark/ml/Transformer.html#transform-org.apache.spark.sql.Dataset-org.apache.spark.ml.param.ParamPair-scala.collection.Seq-)`
* ### Methods inherited from class org.apache.spark.ml.[PipelineStage](../../../../../org/apache/spark/ml/PipelineStage.html "class in org.apache.spark.ml")
  `[params](../../../../../org/apache/spark/ml/PipelineStage.html#params--)`
* ### Methods inherited from class Object
  `equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`
* ### Methods inherited from interface org.apache.spark.ml.util.[DefaultParamsWritable](../../../../../org/apache/spark/ml/util/DefaultParamsWritable.html "interface in org.apache.spark.ml.util")
  `[write](../../../../../org/apache/spark/ml/util/DefaultParamsWritable.html#write--)`
* ### Methods inherited from interface org.apache.spark.ml.util.[MLWritable](../../../../../org/apache/spark/ml/util/MLWritable.html "interface in org.apache.spark.ml.util")
  `[save](../../../../../org/apache/spark/ml/util/MLWritable.html#save-java.lang.String-)`
* ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasInputCol](../../../../../org/apache/spark/ml/param/shared/HasInputCol.html "interface in org.apache.spark.ml.param.shared")
  `[getInputCol](../../../../../org/apache/spark/ml/param/shared/HasInputCol.html#getInputCol--)`
* ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasOutputCol](../../../../../org/apache/spark/ml/param/shared/HasOutputCol.html "interface in org.apache.spark.ml.param.shared")
  `[getOutputCol](../../../../../org/apache/spark/ml/param/shared/HasOutputCol.html#getOutputCol--)`
* ### Methods inherited from interface org.apache.spark.ml.param.[Params](../../../../../org/apache/spark/ml/param/Params.html "interface in org.apache.spark.ml.param")
  `[clear](../../../../../org/apache/spark/ml/param/Params.html#clear-org.apache.spark.ml.param.Param-), [copyValues](../../../../../org/apache/spark/ml/param/Params.html#copyValues-T-org.apache.spark.ml.param.ParamMap-), [defaultCopy](../../../../../org/apache/spark/ml/param/Params.html#defaultCopy-org.apache.spark.ml.param.ParamMap-), [defaultParamMap](../../../../../org/apache/spark/ml/param/Params.html#defaultParamMap--), [explainParam](../../../../../org/apache/spark/ml/param/Params.html#explainParam-org.apache.spark.ml.param.Param-), [explainParams](../../../../../org/apache/spark/ml/param/Params.html#explainParams--), [extractParamMap](../../../../../org/apache/spark/ml/param/Params.html#extractParamMap--), [extractParamMap](../../../../../org/apache/spark/ml/param/Params.html#extractParamMap-org.apache.spark.ml.param.ParamMap-), [get](../../../../../org/apache/spark/ml/param/Params.html#get-org.apache.spark.ml.param.Param-), [getDefault](../../../../../org/apache/spark/ml/param/Params.html#getDefault-org.apache.spark.ml.param.Param-), [getOrDefault](../../../../../org/apache/spark/ml/param/Params.html#getOrDefault-org.apache.spark.ml.param.Param-), [getParam](../../../../../org/apache/spark/ml/param/Params.html#getParam-java.lang.String-), [hasDefault](../../../../../org/apache/spark/ml/param/Params.html#hasDefault-org.apache.spark.ml.param.Param-), [hasParam](../../../../../org/apache/spark/ml/param/Params.html#hasParam-java.lang.String-), [isDefined](../../../../../org/apache/spark/ml/param/Params.html#isDefined-org.apache.spark.ml.param.Param-), [isSet](../../../../../org/apache/spark/ml/param/Params.html#isSet-org.apache.spark.ml.param.Param-), [onParamChange](../../../../../org/apache/spark/ml/param/Params.html#onParamChange-org.apache.spark.ml.param.Param-), [paramMap](../../../../../org/apache/spark/ml/param/Params.html#paramMap--), [params](../../../../../org/apache/spark/ml/param/Params.html#params--), [set](../../../../../org/apache/spark/ml/param/Params.html#set-org.apache.spark.ml.param.Param-T-), [set](../../../../../org/apache/spark/ml/param/Params.html#set-org.apache.spark.ml.param.ParamPair-), [set](../../../../../org/apache/spark/ml/param/Params.html#set-java.lang.String-java.lang.Object-), [setDefault](../../../../../org/apache/spark/ml/param/Params.html#setDefault-org.apache.spark.ml.param.Param-T-), [setDefault](../../../../../org/apache/spark/ml/param/Params.html#setDefault-scala.collection.Seq-), [shouldOwn](../../../../../org/apache/spark/ml/param/Params.html#shouldOwn-org.apache.spark.ml.param.Param-)`
* ### Methods inherited from interface org.apache.spark.ml.util.[Identifiable](../../../../../org/apache/spark/ml/util/Identifiable.html "interface in org.apache.spark.ml.util")
  `[toString](../../../../../org/apache/spark/ml/util/Identifiable.html#toString--)`
* ### Methods inherited from interface org.apache.spark.internal.Logging
  `$init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize`
Constructor Detail
* #### Tokenizer
  public Tokenizer(String uid)
* #### Tokenizer
  public Tokenizer()
Method Detail
* #### load
  public static [Tokenizer](../../../../../org/apache/spark/ml/feature/Tokenizer.html "class in org.apache.spark.ml.feature") load(String path)
* #### read
  public static [MLReader](../../../../../org/apache/spark/ml/util/MLReader.html "class in org.apache.spark.ml.util")<T> read()
* #### uid
  public String uid()

  An immutable unique ID for the object and its derivatives.

  Specified by: `[uid](../../../../../org/apache/spark/ml/util/Identifiable.html#uid--)` in interface `[Identifiable](../../../../../org/apache/spark/ml/util/Identifiable.html "interface in org.apache.spark.ml.util")`

  Returns: (undocumented)
* #### copy
  public [Tokenizer](../../../../../org/apache/spark/ml/feature/Tokenizer.html "class in org.apache.spark.ml.feature") copy([ParamMap](../../../../../org/apache/spark/ml/param/ParamMap.html "class in org.apache.spark.ml.param") extra)

  Description copied from interface: `[Params](../../../../../org/apache/spark/ml/param/Params.html#copy-org.apache.spark.ml.param.ParamMap-)`

  Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See `defaultCopy()`.

  Specified by: `[copy](../../../../../org/apache/spark/ml/param/Params.html#copy-org.apache.spark.ml.param.ParamMap-)` in interface `[Params](../../../../../org/apache/spark/ml/param/Params.html "interface in org.apache.spark.ml.param")`

  Overrides: `[copy](../../../../../org/apache/spark/ml/UnaryTransformer.html#copy-org.apache.spark.ml.param.ParamMap-)` in class `[UnaryTransformer](../../../../../org/apache/spark/ml/UnaryTransformer.html "class in org.apache.spark.ml")<String,scala.collection.Seq<String>,[Tokenizer](../../../../../org/apache/spark/ml/feature/Tokenizer.html "class in org.apache.spark.ml.feature")>`

  Parameters: `extra` \- (undocumented)

  Returns: (undocumented)