UnaryTransformer (Spark 3.5.5 JavaDoc) (original) (raw)
Object
- org.apache.spark.ml.PipelineStage
- org.apache.spark.ml.Transformer
- org.apache.spark.ml.UnaryTransformer<IN,OUT,T>
- org.apache.spark.ml.Transformer
All Implemented Interfaces:
java.io.Serializable, org.apache.spark.internal.Logging, Params, HasInputCol, HasOutputCol, Identifiable
Direct Known Subclasses:
DCT, ElementwiseProduct, NGram, Normalizer, PolynomialExpansion, RegexTokenizer, Tokenizer
public abstract class UnaryTransformer<IN,OUT,T extends UnaryTransformer<IN,OUT,T>>
extends Transformer
implements HasInputCol, HasOutputCol, org.apache.spark.internal.Logging
Abstract class for transformers that take one input column, apply transformation, and output the result as a new column.
See Also:
Serialized Form
Nested Class Summary
* ### Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging `org.apache.spark.internal.Logging.SparkShellLoggingFilter`
Constructor Summary
Constructors
Constructor and Description UnaryTransformer(scala.reflect.api.TypeTags.TypeTag<IN> evidence$1, scala.reflect.api.TypeTags.TypeTag<OUT> evidence$2) Method Summary
All Methods Instance Methods Concrete Methods
Modifier and Type Method and Description T copy(ParamMap extra) Creates a copy of this instance with the same UID and some extra params. Param inputCol() Param for input column name. Param outputCol() Param for output column name. T setInputCol(String value) T setOutputCol(String value) Dataset<Row> transform(Dataset<?> dataset) Transforms the input dataset. StructType transformSchema(StructType schema) Check transform validity and derive the output schema from the input schema. * ### Methods inherited from class org.apache.spark.ml.[Transformer](../../../../org/apache/spark/ml/Transformer.html "class in org.apache.spark.ml") `[transform](../../../../org/apache/spark/ml/Transformer.html#transform-org.apache.spark.sql.Dataset-org.apache.spark.ml.param.ParamMap-), [transform](../../../../org/apache/spark/ml/Transformer.html#transform-org.apache.spark.sql.Dataset-org.apache.spark.ml.param.ParamPair-org.apache.spark.ml.param.ParamPair...-), [transform](../../../../org/apache/spark/ml/Transformer.html#transform-org.apache.spark.sql.Dataset-org.apache.spark.ml.param.ParamPair-scala.collection.Seq-)` * ### Methods inherited from class org.apache.spark.ml.[PipelineStage](../../../../org/apache/spark/ml/PipelineStage.html "class in org.apache.spark.ml") `[params](../../../../org/apache/spark/ml/PipelineStage.html#params--)` * ### Methods inherited from class Object `equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait` * ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasInputCol](../../../../org/apache/spark/ml/param/shared/HasInputCol.html "interface in org.apache.spark.ml.param.shared") `[getInputCol](../../../../org/apache/spark/ml/param/shared/HasInputCol.html#getInputCol--)` * ### Methods inherited from interface org.apache.spark.ml.param.shared.[HasOutputCol](../../../../org/apache/spark/ml/param/shared/HasOutputCol.html "interface in org.apache.spark.ml.param.shared") `[getOutputCol](../../../../org/apache/spark/ml/param/shared/HasOutputCol.html#getOutputCol--)` * ### Methods inherited from interface org.apache.spark.ml.param.[Params](../../../../org/apache/spark/ml/param/Params.html "interface in org.apache.spark.ml.param") `[clear](../../../../org/apache/spark/ml/param/Params.html#clear-org.apache.spark.ml.param.Param-), [copyValues](../../../../org/apache/spark/ml/param/Params.html#copyValues-T-org.apache.spark.ml.param.ParamMap-), [defaultCopy](../../../../org/apache/spark/ml/param/Params.html#defaultCopy-org.apache.spark.ml.param.ParamMap-), [defaultParamMap](../../../../org/apache/spark/ml/param/Params.html#defaultParamMap--), [explainParam](../../../../org/apache/spark/ml/param/Params.html#explainParam-org.apache.spark.ml.param.Param-), [explainParams](../../../../org/apache/spark/ml/param/Params.html#explainParams--), [extractParamMap](../../../../org/apache/spark/ml/param/Params.html#extractParamMap--), [extractParamMap](../../../../org/apache/spark/ml/param/Params.html#extractParamMap-org.apache.spark.ml.param.ParamMap-), [get](../../../../org/apache/spark/ml/param/Params.html#get-org.apache.spark.ml.param.Param-), [getDefault](../../../../org/apache/spark/ml/param/Params.html#getDefault-org.apache.spark.ml.param.Param-), [getOrDefault](../../../../org/apache/spark/ml/param/Params.html#getOrDefault-org.apache.spark.ml.param.Param-), [getParam](../../../../org/apache/spark/ml/param/Params.html#getParam-java.lang.String-), [hasDefault](../../../../org/apache/spark/ml/param/Params.html#hasDefault-org.apache.spark.ml.param.Param-), [hasParam](../../../../org/apache/spark/ml/param/Params.html#hasParam-java.lang.String-), [isDefined](../../../../org/apache/spark/ml/param/Params.html#isDefined-org.apache.spark.ml.param.Param-), [isSet](../../../../org/apache/spark/ml/param/Params.html#isSet-org.apache.spark.ml.param.Param-), [onParamChange](../../../../org/apache/spark/ml/param/Params.html#onParamChange-org.apache.spark.ml.param.Param-), [paramMap](../../../../org/apache/spark/ml/param/Params.html#paramMap--), [params](../../../../org/apache/spark/ml/param/Params.html#params--), [set](../../../../org/apache/spark/ml/param/Params.html#set-org.apache.spark.ml.param.Param-T-), [set](../../../../org/apache/spark/ml/param/Params.html#set-org.apache.spark.ml.param.ParamPair-), [set](../../../../org/apache/spark/ml/param/Params.html#set-java.lang.String-java.lang.Object-), [setDefault](../../../../org/apache/spark/ml/param/Params.html#setDefault-org.apache.spark.ml.param.Param-T-), [setDefault](../../../../org/apache/spark/ml/param/Params.html#setDefault-scala.collection.Seq-), [shouldOwn](../../../../org/apache/spark/ml/param/Params.html#shouldOwn-org.apache.spark.ml.param.Param-)` * ### Methods inherited from interface org.apache.spark.ml.util.[Identifiable](../../../../org/apache/spark/ml/util/Identifiable.html "interface in org.apache.spark.ml.util") `[toString](../../../../org/apache/spark/ml/util/Identifiable.html#toString--), [uid](../../../../org/apache/spark/ml/util/Identifiable.html#uid--)` * ### Methods inherited from interface org.apache.spark.internal.Logging `$init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize`
Constructor Detail
* #### UnaryTransformer public UnaryTransformer(scala.reflect.api.TypeTags.TypeTag<[IN](../../../../org/apache/spark/ml/UnaryTransformer.html "type parameter in UnaryTransformer")> evidence$1, scala.reflect.api.TypeTags.TypeTag<[OUT](../../../../org/apache/spark/ml/UnaryTransformer.html "type parameter in UnaryTransformer")> evidence$2)
Method Detail
* #### copy public [T](../../../../org/apache/spark/ml/UnaryTransformer.html "type parameter in UnaryTransformer") copy([ParamMap](../../../../org/apache/spark/ml/param/ParamMap.html "class in org.apache.spark.ml.param") extra) Description copied from interface: `[Params](../../../../org/apache/spark/ml/param/Params.html#copy-org.apache.spark.ml.param.ParamMap-)` Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See `defaultCopy()`. Specified by: `[copy](../../../../org/apache/spark/ml/param/Params.html#copy-org.apache.spark.ml.param.ParamMap-)` in interface `[Params](../../../../org/apache/spark/ml/param/Params.html "interface in org.apache.spark.ml.param")` Specified by: `[copy](../../../../org/apache/spark/ml/Transformer.html#copy-org.apache.spark.ml.param.ParamMap-)` in class `[Transformer](../../../../org/apache/spark/ml/Transformer.html "class in org.apache.spark.ml")` Parameters: `extra` \- (undocumented) Returns: (undocumented) * #### inputCol public final [Param](../../../../org/apache/spark/ml/param/Param.html "class in org.apache.spark.ml.param")<String> inputCol() Description copied from interface: `[HasInputCol](../../../../org/apache/spark/ml/param/shared/HasInputCol.html#inputCol--)` Param for input column name. Specified by: `[inputCol](../../../../org/apache/spark/ml/param/shared/HasInputCol.html#inputCol--)` in interface `[HasInputCol](../../../../org/apache/spark/ml/param/shared/HasInputCol.html "interface in org.apache.spark.ml.param.shared")` Returns: (undocumented) * #### outputCol public final [Param](../../../../org/apache/spark/ml/param/Param.html "class in org.apache.spark.ml.param")<String> outputCol() Param for output column name. Specified by: `[outputCol](../../../../org/apache/spark/ml/param/shared/HasOutputCol.html#outputCol--)` in interface `[HasOutputCol](../../../../org/apache/spark/ml/param/shared/HasOutputCol.html "interface in org.apache.spark.ml.param.shared")` Returns: (undocumented) * #### setInputCol public [T](../../../../org/apache/spark/ml/UnaryTransformer.html "type parameter in UnaryTransformer") setInputCol(String value) * #### setOutputCol public [T](../../../../org/apache/spark/ml/UnaryTransformer.html "type parameter in UnaryTransformer") setOutputCol(String value) * #### transform public [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")> transform([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> dataset) Transforms the input dataset. Specified by: `[transform](../../../../org/apache/spark/ml/Transformer.html#transform-org.apache.spark.sql.Dataset-)` in class `[Transformer](../../../../org/apache/spark/ml/Transformer.html "class in org.apache.spark.ml")` Parameters: `dataset` \- (undocumented) Returns: (undocumented) * #### transformSchema public [StructType](../../../../org/apache/spark/sql/types/StructType.html "class in org.apache.spark.sql.types") transformSchema([StructType](../../../../org/apache/spark/sql/types/StructType.html "class in org.apache.spark.sql.types") schema) Check transform validity and derive the output schema from the input schema. We check validity for interactions between parameters during `transformSchema` and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by `Param.validate()`. Typical implementation should first conduct verification on schema change and parameter validity, including complex parameter interaction checks. Specified by: `[transformSchema](../../../../org/apache/spark/ml/PipelineStage.html#transformSchema-org.apache.spark.sql.types.StructType-)` in class `[PipelineStage](../../../../org/apache/spark/ml/PipelineStage.html "class in org.apache.spark.ml")` Parameters: `schema` \- (undocumented) Returns: (undocumented)