Dataset (Spark 3.5.5 JavaDoc)
Modifier and Type
Method and Description
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[agg](../../../../org/apache/spark/sql/Dataset.html#agg-org.apache.spark.sql.Column-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... exprs)
Aggregates on the entire Dataset without groups.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[agg](../../../../org/apache/spark/sql/Dataset.html#agg-org.apache.spark.sql.Column-scala.collection.Seq-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> exprs)
Aggregates on the entire Dataset without groups.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[agg](../../../../org/apache/spark/sql/Dataset.html#agg-scala.collection.immutable.Map-)(scala.collection.immutable.Map<String,String> exprs)
(Scala-specific) Aggregates on the entire Dataset without groups.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[agg](../../../../org/apache/spark/sql/Dataset.html#agg-java.util.Map-)(java.util.Map<String,String> exprs)
(Java-specific) Aggregates on the entire Dataset without groups.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[agg](../../../../org/apache/spark/sql/Dataset.html#agg-scala.Tuple2-scala.collection.Seq-)(scala.Tuple2<String,String> aggExpr, scala.collection.Seq<scala.Tuple2<String,String>> aggExprs)
(Scala-specific) Aggregates on the entire Dataset without groups.
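For instance, a minimal sketch of two `agg` variants, assuming an existing `Dataset<Row>` named `df` with numeric `age` and `salary` columns:

```java
import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.max;

// Column-based aggregation over the whole Dataset (no grouping).
Dataset<Row> stats = df.agg(max(df.col("age")), avg(df.col("salary")));

// Java-specific Map variant: column name -> aggregate function name.
java.util.Map<String, String> exprs = new java.util.HashMap<>();
exprs.put("age", "max");
exprs.put("salary", "avg");
Dataset<Row> stats2 = df.agg(exprs);
```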
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[alias](../../../../org/apache/spark/sql/Dataset.html#alias-java.lang.String-)(String alias)
Returns a new Dataset with an alias set.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[alias](../../../../org/apache/spark/sql/Dataset.html#alias-scala.Symbol-)(scala.Symbol alias)
(Scala-specific) Returns a new Dataset with an alias set.
[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")
[apply](../../../../org/apache/spark/sql/Dataset.html#apply-java.lang.String-)(String colName)
Selects a column based on the column name and returns it as a Column.
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>
[as](../../../../org/apache/spark/sql/Dataset.html#as-org.apache.spark.sql.Encoder-)([Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> evidence$2)
Returns a new Dataset where each record has been mapped to the specified type.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[as](../../../../org/apache/spark/sql/Dataset.html#as-java.lang.String-)(String alias)
Returns a new Dataset with an alias set.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[as](../../../../org/apache/spark/sql/Dataset.html#as-scala.Symbol-)(scala.Symbol alias)
(Scala-specific) Returns a new Dataset with an alias set.
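A brief sketch of both flavors, assuming `df` exists and a hypothetical JavaBean `Person` whose fields match `df`'s columns:

```java
import org.apache.spark.sql.Encoders;

// Person is an illustrative JavaBean (name/age getters and setters),
// not part of the Spark API.
Dataset<Person> people = df.as(Encoders.bean(Person.class));

// The String overload only attaches an alias, e.g. for self-joins.
Dataset<Row> aliased = df.as("left");
```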
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[cache](../../../../org/apache/spark/sql/Dataset.html#cache--)()
Persist this Dataset with the default storage level (`MEMORY_AND_DISK`).
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[checkpoint](../../../../org/apache/spark/sql/Dataset.html#checkpoint--)()
Eagerly checkpoints a Dataset and returns the new Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[checkpoint](../../../../org/apache/spark/sql/Dataset.html#checkpoint-boolean-)(boolean eager)
Returns a checkpointed version of this Dataset.
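A sketch of both overloads; the `spark` session and checkpoint path are assumptions here:

```java
// Reliable checkpoints require a checkpoint directory (path illustrative).
spark.sparkContext().setCheckpointDir("/tmp/spark-checkpoints");

Dataset<Row> eager = df.checkpoint();     // materializes immediately
Dataset<Row> lazy = df.checkpoint(false); // materializes on the first action
```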
scala.reflect.ClassTag<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[classTag](../../../../org/apache/spark/sql/Dataset.html#classTag--)()
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[coalesce](../../../../org/apache/spark/sql/Dataset.html#coalesce-int-)(int numPartitions)
Returns a new Dataset that has exactly `numPartitions` partitions, when fewer partitions are requested.
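For example, a common use is shrinking to a single partition before writing one output file (a sketch, paths illustrative):

```java
// Narrow dependency: merges existing partitions without a full shuffle.
df.coalesce(1)
  .write()
  .mode("overwrite")
  .parquet("/tmp/out"); // illustrative output path
```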
static String
[COL_POS_KEY](../../../../org/apache/spark/sql/Dataset.html#COL%5FPOS%5FKEY--)()
[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")
[col](../../../../org/apache/spark/sql/Dataset.html#col-java.lang.String-)(String colName)
Selects a column based on the column name and returns it as a Column.
Object
[collect](../../../../org/apache/spark/sql/Dataset.html#collect--)()
Returns an array that contains all rows in this Dataset.
java.util.List<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[collectAsList](../../../../org/apache/spark/sql/Dataset.html#collectAsList--)()
Returns a Java list that contains all rows in this Dataset.
[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")
[colRegex](../../../../org/apache/spark/sql/Dataset.html#colRegex-java.lang.String-)(String colName)
Selects a column based on the column name specified as a regex and returns it as a Column.
String[]
[columns](../../../../org/apache/spark/sql/Dataset.html#columns--)()
Returns all column names as an array.
long
[count](../../../../org/apache/spark/sql/Dataset.html#count--)()
Returns the number of rows in the Dataset.
void
[createGlobalTempView](../../../../org/apache/spark/sql/Dataset.html#createGlobalTempView-java.lang.String-)(String viewName)
Creates a global temporary view using the given name.
void
[createOrReplaceGlobalTempView](../../../../org/apache/spark/sql/Dataset.html#createOrReplaceGlobalTempView-java.lang.String-)(String viewName)
Creates or replaces a global temporary view using the given name.
void
[createOrReplaceTempView](../../../../org/apache/spark/sql/Dataset.html#createOrReplaceTempView-java.lang.String-)(String viewName)
Creates a local temporary view using the given name.
void
[createTempView](../../../../org/apache/spark/sql/Dataset.html#createTempView-java.lang.String-)(String viewName)
Creates a local temporary view using the given name.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[crossJoin](../../../../org/apache/spark/sql/Dataset.html#crossJoin-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right)
Explicit cartesian join with another `DataFrame`.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[cube](../../../../org/apache/spark/sql/Dataset.html#cube-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)
Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[cube](../../../../org/apache/spark/sql/Dataset.html#cube-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)
Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[cube](../../../../org/apache/spark/sql/Dataset.html#cube-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)
Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[cube](../../../../org/apache/spark/sql/Dataset.html#cube-java.lang.String-java.lang.String...-)(String col1, String... cols)
Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.
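A minimal sketch, assuming `df` has `dept` and `city` columns; the cube includes subtotals for every combination of the grouping columns plus the grand total:

```java
// Counts for (dept, city), (dept), (city), and the grand total ().
Dataset<Row> cubed = df.cube("dept", "city").count();
cubed.show();
```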
static java.util.concurrent.atomic.AtomicLong
[curId](../../../../org/apache/spark/sql/Dataset.html#curId--)()
static String
[DATASET_ID_KEY](../../../../org/apache/spark/sql/Dataset.html#DATASET%5FID%5FKEY--)()
static org.apache.spark.sql.catalyst.trees.TreeNodeTag<scala.collection.mutable.HashSet<Object>>
[DATASET_ID_TAG](../../../../org/apache/spark/sql/Dataset.html#DATASET%5FID%5FTAG--)()
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[describe](../../../../org/apache/spark/sql/Dataset.html#describe-scala.collection.Seq-)(scala.collection.Seq<String> cols)
Computes basic statistics for numeric and string columns, including count, mean, stddev, min, and max.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[describe](../../../../org/apache/spark/sql/Dataset.html#describe-java.lang.String...-)(String... cols)
Computes basic statistics for numeric and string columns, including count, mean, stddev, min, and max.
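For example (a sketch, assuming `df` has `age` and `salary` columns):

```java
// count, mean, stddev, min, and max for the named columns.
df.describe("age", "salary").show();
```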
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[distinct](../../../../org/apache/spark/sql/Dataset.html#distinct--)()
Returns a new Dataset that contains only the unique rows from this Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[drop](../../../../org/apache/spark/sql/Dataset.html#drop-org.apache.spark.sql.Column-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") col)
Returns a new Dataset with column dropped.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[drop](../../../../org/apache/spark/sql/Dataset.html#drop-org.apache.spark.sql.Column-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") col,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)
Returns a new Dataset with columns dropped.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[drop](../../../../org/apache/spark/sql/Dataset.html#drop-org.apache.spark.sql.Column-scala.collection.Seq-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") col, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)
Returns a new Dataset with columns dropped.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[drop](../../../../org/apache/spark/sql/Dataset.html#drop-scala.collection.Seq-)(scala.collection.Seq<String> colNames)
Returns a new Dataset with columns dropped.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[drop](../../../../org/apache/spark/sql/Dataset.html#drop-java.lang.String...-)(String... colNames)
Returns a new Dataset with columns dropped.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[drop](../../../../org/apache/spark/sql/Dataset.html#drop-java.lang.String-)(String colName)
Returns a new Dataset with a column dropped.
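A short sketch of the name- and `Column`-based variants, assuming `df` exists:

```java
// By name: silently a no-op if the column does not exist.
Dataset<Row> noAge = df.drop("age");

// By Column reference, then several more by name.
Dataset<Row> trimmed = df.drop(df.col("age")).drop("salary", "city");
```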
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates--)()
Returns a new Dataset that contains only the unique rows from this Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates-scala.collection.Seq-)(scala.collection.Seq<String> colNames)
(Scala-specific) Returns a new Dataset with duplicate rows removed, considering only the subset of columns.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates-java.lang.String:A-)(String[] colNames)
Returns a new Dataset with duplicate rows removed, considering only the subset of columns.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)
Returns a new Dataset with duplicate rows removed, considering only the subset of columns.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates-java.lang.String-java.lang.String...-)(String col1, String... cols)
Returns a new Dataset with duplicate rows removed, considering only the subset of columns.
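A sketch of both forms, assuming `df` has an `id` column:

```java
// All columns considered.
Dataset<Row> unique = df.dropDuplicates();

// Only `id` decides duplicates; which row survives per id is arbitrary.
Dataset<Row> uniqueIds = df.dropDuplicates("id");
```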
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark--)()
Returns a new Dataset with duplicate rows removed, within watermark.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark-scala.collection.Seq-)(scala.collection.Seq<String> colNames)
Returns a new Dataset with duplicate rows removed, considering only the subset of columns, within watermark.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark-java.lang.String:A-)(String[] colNames)
Returns a new Dataset with duplicate rows removed, considering only the subset of columns, within watermark.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)
Returns a new Dataset with duplicate rows removed, considering only the subset of columns, within watermark.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark-java.lang.String-java.lang.String...-)(String col1, String... cols)
Returns a new Dataset with duplicate rows removed, considering only the subset of columns, within watermark.
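These variants are intended for streaming queries; a hedged sketch, assuming a streaming Dataset `events` with an `eventTime` timestamp column and an `id` key:

```java
// Streaming dedup whose state can be dropped once the watermark passes,
// unlike dropDuplicates, which keeps key state indefinitely.
Dataset<Row> deduped = events
    .withWatermark("eventTime", "10 minutes")
    .dropDuplicatesWithinWatermark("id");
```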
scala.Tuple2<String,String>[]
[dtypes](../../../../org/apache/spark/sql/Dataset.html#dtypes--)()
Returns all column names and their data types as an array.
[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[encoder](../../../../org/apache/spark/sql/Dataset.html#encoder--)()
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[except](../../../../org/apache/spark/sql/Dataset.html#except-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)
Returns a new Dataset containing rows in this Dataset but not in another Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[exceptAll](../../../../org/apache/spark/sql/Dataset.html#exceptAll-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)
Returns a new Dataset containing rows in this Dataset but not in another Dataset while preserving the duplicates.
void
[explain](../../../../org/apache/spark/sql/Dataset.html#explain--)()
Prints the physical plan to the console for debugging purposes.
void
[explain](../../../../org/apache/spark/sql/Dataset.html#explain-boolean-)(boolean extended)
Prints the plans (logical and physical) to the console for debugging purposes.
void
[explain](../../../../org/apache/spark/sql/Dataset.html#explain-java.lang.String-)(String mode)
Prints the plans (logical and physical) with a format specified by a given explain mode.
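For example, assuming `df` exists:

```java
df.explain();            // physical plan only
df.explain(true);        // logical and physical plans
df.explain("formatted"); // modes: "simple", "extended", "codegen", "cost", "formatted"
```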
<A extends scala.Product> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[explode](../../../../org/apache/spark/sql/Dataset.html#explode-scala.collection.Seq-scala.Function1-scala.reflect.api.TypeTags.TypeTag-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> input, scala.Function1<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql"),scala.collection.TraversableOnce<A>> f, scala.reflect.api.TypeTags.TypeTag<A> evidence$4)
<A,B> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[explode](../../../../org/apache/spark/sql/Dataset.html#explode-java.lang.String-java.lang.String-scala.Function1-scala.reflect.api.TypeTags.TypeTag-)(String inputColumn, String outputColumn, scala.Function1<A,scala.collection.TraversableOnce<B>> f, scala.reflect.api.TypeTags.TypeTag<B> evidence$5)
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[filter](../../../../org/apache/spark/sql/Dataset.html#filter-org.apache.spark.sql.Column-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") condition)
Filters rows using the given condition.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[filter](../../../../org/apache/spark/sql/Dataset.html#filter-org.apache.spark.api.java.function.FilterFunction-)([FilterFunction](../../../../org/apache/spark/api/java/function/FilterFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)
(Java-specific) Returns a new Dataset that only contains elements where `func` returns `true`.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[filter](../../../../org/apache/spark/sql/Dataset.html#filter-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),Object> func)
(Scala-specific) Returns a new Dataset that only contains elements where `func` returns `true`.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[filter](../../../../org/apache/spark/sql/Dataset.html#filter-java.lang.String-)(String conditionExpr)
Filters rows using the given SQL expression.
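A sketch of the three styles, assuming `df` has an integer `age` column; the cast selects the Java-specific overload:

```java
import org.apache.spark.api.java.function.FilterFunction;

Dataset<Row> adults1 = df.filter(df.col("age").gt(21)); // Column condition
Dataset<Row> adults2 = df.filter("age > 21");           // SQL expression

// Java-specific functional variant; the cast disambiguates the overload.
Dataset<Row> adults3 =
    df.filter((FilterFunction<Row>) r -> r.<Integer>getAs("age") > 21);
```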
[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")
[first](../../../../org/apache/spark/sql/Dataset.html#first--)()
Returns the first row.
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>
[flatMap](../../../../org/apache/spark/sql/Dataset.html#flatMap-org.apache.spark.api.java.function.FlatMapFunction-org.apache.spark.sql.Encoder-)([FlatMapFunction](../../../../org/apache/spark/api/java/function/FlatMapFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U> f,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> encoder)
(Java-specific) Returns a new Dataset by first applying a function to all elements of this Dataset, and then flattening the results.
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>
[flatMap](../../../../org/apache/spark/sql/Dataset.html#flatMap-scala.Function1-org.apache.spark.sql.Encoder-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),scala.collection.TraversableOnce<U>> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> evidence$8)
(Scala-specific) Returns a new Dataset by first applying a function to all elements of this Dataset, and then flattening the results.
void
[foreach](../../../../org/apache/spark/sql/Dataset.html#foreach-org.apache.spark.api.java.function.ForeachFunction-)([ForeachFunction](../../../../org/apache/spark/api/java/function/ForeachFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)
(Java-specific) Runs `func` on each element of this Dataset.
void
[foreach](../../../../org/apache/spark/sql/Dataset.html#foreach-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),scala.runtime.BoxedUnit> f)
Applies a function `f` to all rows.
void
[foreachPartition](../../../../org/apache/spark/sql/Dataset.html#foreachPartition-org.apache.spark.api.java.function.ForeachPartitionFunction-)([ForeachPartitionFunction](../../../../org/apache/spark/api/java/function/ForeachPartitionFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)
(Java-specific) Runs `func` on each partition of this Dataset.
void
[foreachPartition](../../../../org/apache/spark/sql/Dataset.html#foreachPartition-scala.Function1-)(scala.Function1<scala.collection.Iterator<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>,scala.runtime.BoxedUnit> f)
Applies a function `f` to each partition of this Dataset.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[groupBy](../../../../org/apache/spark/sql/Dataset.html#groupBy-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)
Groups the Dataset using the specified columns, so we can run aggregation on them.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[groupBy](../../../../org/apache/spark/sql/Dataset.html#groupBy-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)
Groups the Dataset using the specified columns, so we can run aggregation on them.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[groupBy](../../../../org/apache/spark/sql/Dataset.html#groupBy-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)
Groups the Dataset using the specified columns, so that we can run aggregation on them.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[groupBy](../../../../org/apache/spark/sql/Dataset.html#groupBy-java.lang.String-java.lang.String...-)(String col1, String... cols)
Groups the Dataset using the specified columns, so that we can run aggregation on them.
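A minimal sketch, assuming `dept` and `salary` columns:

```java
import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.count;

Dataset<Row> byDept = df.groupBy("dept")
    .agg(avg("salary").as("avg_salary"), count("*").as("n"));
```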
<K> [KeyValueGroupedDataset](../../../../org/apache/spark/sql/KeyValueGroupedDataset.html "class in org.apache.spark.sql")<K,[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[groupByKey](../../../../org/apache/spark/sql/Dataset.html#groupByKey-scala.Function1-org.apache.spark.sql.Encoder-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),K> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<K> evidence$3)
(Scala-specific) Returns a KeyValueGroupedDataset where the data is grouped by the given key `func`.
<K> [KeyValueGroupedDataset](../../../../org/apache/spark/sql/KeyValueGroupedDataset.html "class in org.apache.spark.sql")<K,[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[groupByKey](../../../../org/apache/spark/sql/Dataset.html#groupByKey-org.apache.spark.api.java.function.MapFunction-org.apache.spark.sql.Encoder-)([MapFunction](../../../../org/apache/spark/api/java/function/MapFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),K> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<K> encoder)
(Java-specific) Returns a KeyValueGroupedDataset where the data is grouped by the given key `func`.
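A Java-specific sketch, assuming `df` has a string `dept` column:

```java
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Encoders;
import scala.Tuple2;

// Group rows by a computed key, then count per key.
Dataset<Tuple2<String, Object>> counts = df
    .groupByKey((MapFunction<Row, String>) r -> r.getAs("dept"), Encoders.STRING())
    .count();
```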
[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")
[head](../../../../org/apache/spark/sql/Dataset.html#head--)()
Returns the first row.
Object
[head](../../../../org/apache/spark/sql/Dataset.html#head-int-)(int n)
Returns the first `n` rows.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[hint](../../../../org/apache/spark/sql/Dataset.html#hint-java.lang.String-java.lang.Object...-)(String name, Object... parameters)
Specifies some hint on the current Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[hint](../../../../org/apache/spark/sql/Dataset.html#hint-java.lang.String-scala.collection.Seq-)(String name, scala.collection.Seq<Object> parameters)
Specifies some hint on the current Dataset.
String[]
[inputFiles](../../../../org/apache/spark/sql/Dataset.html#inputFiles--)()
Returns a best-effort snapshot of the files that compose this Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[intersect](../../../../org/apache/spark/sql/Dataset.html#intersect-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)
Returns a new Dataset containing rows only in both this Dataset and another Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[intersectAll](../../../../org/apache/spark/sql/Dataset.html#intersectAll-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)
Returns a new Dataset containing rows only in both this Dataset and another Dataset while preserving the duplicates.
boolean
[isEmpty](../../../../org/apache/spark/sql/Dataset.html#isEmpty--)()
Returns `true` if the Dataset is empty.
boolean
[isLocal](../../../../org/apache/spark/sql/Dataset.html#isLocal--)()
Returns `true` if the `collect` and `take` methods can be run locally (without any Spark executors).
boolean
[isStreaming](../../../../org/apache/spark/sql/Dataset.html#isStreaming--)()
Returns true if this Dataset contains one or more sources that continuously return data as it arrives.
[JavaRDD](../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[javaRDD](../../../../org/apache/spark/sql/Dataset.html#javaRDD--)()
Returns the content of the Dataset as a `JavaRDD` of `T`s.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right)
Join with another `DataFrame`.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-org.apache.spark.sql.Column-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") joinExprs)
Inner join with another `DataFrame`, using the given join expression.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-org.apache.spark.sql.Column-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") joinExprs, String joinType)
Join with another `DataFrame`, using the given join expression.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-scala.collection.Seq-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, scala.collection.Seq<String> usingColumns)
(Scala-specific) Inner equi-join with another `DataFrame` using the given columns.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-scala.collection.Seq-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, scala.collection.Seq<String> usingColumns, String joinType)
(Scala-specific) Equi-join with another `DataFrame` using the given columns.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, String usingColumn)
Inner equi-join with another `DataFrame` using the given column.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-java.lang.String:A-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, String[] usingColumns)
(Java-specific) Inner equi-join with another `DataFrame` using the given columns.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-java.lang.String:A-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, String[] usingColumns, String joinType)
(Java-specific) Equi-join with another `DataFrame` using the given columns.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-java.lang.String-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, String usingColumn, String joinType)
Equi-join with another `DataFrame` using the given column.
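A sketch of the common variants, assuming two DataFrames `left` and `right` that share an `id` column:

```java
// Inner equi-join on a shared column; "id" appears once in the output.
Dataset<Row> inner = left.join(right, "id");

// Java-specific: several USING columns plus an explicit join type.
Dataset<Row> outer = left.join(right, new String[] {"id"}, "left_outer");

// Arbitrary join expression with a join type.
Dataset<Row> byExpr =
    left.join(right, left.col("id").equalTo(right.col("id")), "inner");
```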
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple2<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U>>
[joinWith](../../../../org/apache/spark/sql/Dataset.html#joinWith-org.apache.spark.sql.Dataset-org.apache.spark.sql.Column-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U> other,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") condition)
Joins this Dataset with another Dataset using an inner equi-join, returning a `Tuple2` for each pair where `condition` evaluates to true.
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple2<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U>>
[joinWith](../../../../org/apache/spark/sql/Dataset.html#joinWith-org.apache.spark.sql.Dataset-org.apache.spark.sql.Column-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U> other,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") condition, String joinType)
Joins this Dataset with another Dataset, returning a `Tuple2` for each pair where `condition` evaluates to true.
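Unlike `join`, both sides stay intact as typed values; a sketch under the same `left`/`right` assumption:

```java
import scala.Tuple2;

// Both sides survive as whole values rather than flattened columns.
Dataset<Tuple2<Row, Row>> pairs =
    left.joinWith(right, left.col("id").equalTo(right.col("id")), "inner");
```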
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[limit](../../../../org/apache/spark/sql/Dataset.html#limit-int-)(int n)
Returns a new Dataset by taking the first `n` rows.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[localCheckpoint](../../../../org/apache/spark/sql/Dataset.html#localCheckpoint--)()
Eagerly locally checkpoints a Dataset and returns the new Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[localCheckpoint](../../../../org/apache/spark/sql/Dataset.html#localCheckpoint-boolean-)(boolean eager)
Locally checkpoints a Dataset and returns the new Dataset.
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>
[map](../../../../org/apache/spark/sql/Dataset.html#map-scala.Function1-org.apache.spark.sql.Encoder-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> evidence$6)
(Scala-specific) Returns a new Dataset that contains the result of applying `func` to each element.
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>
[map](../../../../org/apache/spark/sql/Dataset.html#map-org.apache.spark.api.java.function.MapFunction-org.apache.spark.sql.Encoder-)([MapFunction](../../../../org/apache/spark/api/java/function/MapFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> encoder)
(Java-specific) Returns a new Dataset that contains the result of applying `func` to each element.
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>
[mapPartitions](../../../../org/apache/spark/sql/Dataset.html#mapPartitions-scala.Function1-org.apache.spark.sql.Encoder-)(scala.Function1<scala.collection.Iterator<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>,scala.collection.Iterator<U>> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> evidence$7)
(Scala-specific) Returns a new Dataset that contains the result of applying `func` to each partition.
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>
[mapPartitions](../../../../org/apache/spark/sql/Dataset.html#mapPartitions-org.apache.spark.api.java.function.MapPartitionsFunction-org.apache.spark.sql.Encoder-)([MapPartitionsFunction](../../../../org/apache/spark/api/java/function/MapPartitionsFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U> f,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> encoder)
(Java-specific) Returns a new Dataset that contains the result of applying `f` to each partition.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[melt](../../../../org/apache/spark/sql/Dataset.html#melt-org.apache.spark.sql.Column:A-org.apache.spark.sql.Column:A-java.lang.String-java.lang.String-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] ids,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] values, String variableColumnName, String valueColumnName)
Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[melt](../../../../org/apache/spark/sql/Dataset.html#melt-org.apache.spark.sql.Column:A-java.lang.String-java.lang.String-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] ids, String variableColumnName, String valueColumnName)
Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.
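A sketch, assuming `df` has an `id` column plus `q1` and `q2` value columns to unpivot:

```java
import org.apache.spark.sql.Column;

// Wide -> long: one output row per (id, quarter) pair.
Dataset<Row> longDf = df.melt(
    new Column[] { df.col("id") },               // identifier columns
    new Column[] { df.col("q1"), df.col("q2") }, // columns to unpivot
    "quarter",                                   // variable column name
    "sales");                                    // value column name
```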
[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")
[metadataColumn](../../../../org/apache/spark/sql/Dataset.html#metadataColumn-java.lang.String-)(String colName)
Selects a metadata column based on its logical column name, and returns it as a Column.
[DataFrameNaFunctions](../../../../org/apache/spark/sql/DataFrameNaFunctions.html "class in org.apache.spark.sql")
[na](../../../../org/apache/spark/sql/Dataset.html#na--)()
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[observe](../../../../org/apache/spark/sql/Dataset.html#observe-org.apache.spark.sql.Observation-org.apache.spark.sql.Column-org.apache.spark.sql.Column...-)([Observation](../../../../org/apache/spark/sql/Observation.html "class in org.apache.spark.sql") observation,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... exprs)
Observe (named) metrics through an `org.apache.spark.sql.Observation` instance.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[observe](../../../../org/apache/spark/sql/Dataset.html#observe-org.apache.spark.sql.Observation-org.apache.spark.sql.Column-scala.collection.Seq-)([Observation](../../../../org/apache/spark/sql/Observation.html "class in org.apache.spark.sql") observation,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> exprs)
Observe (named) metrics through an `org.apache.spark.sql.Observation` instance.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[observe](../../../../org/apache/spark/sql/Dataset.html#observe-java.lang.String-org.apache.spark.sql.Column-org.apache.spark.sql.Column...-)(String name,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... exprs)
Define (named) metrics to observe on the Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[observe](../../../../org/apache/spark/sql/Dataset.html#observe-java.lang.String-org.apache.spark.sql.Column-scala.collection.Seq-)(String name,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> exprs)
Define (named) metrics to observe on the Dataset.
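A batch-query sketch, assuming `df` exists; the metric name `rows` is illustrative:

```java
import org.apache.spark.sql.Observation;
import static org.apache.spark.sql.functions.count;
import static org.apache.spark.sql.functions.lit;

Observation obs = new Observation("stats");
Dataset<Row> observed = df.observe(obs, count(lit(1)).as("rows"));
observed.collect(); // metrics become available once an action finishes
java.util.Map<String, Object> metrics = obs.getAsJava();
```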
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[offset](../../../../org/apache/spark/sql/Dataset.html#offset-int-)(int n)
Returns a new Dataset by skipping the first `n` rows.
static [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[ofRows](../../../../org/apache/spark/sql/Dataset.html#ofRows-org.apache.spark.sql.SparkSession-org.apache.spark.sql.catalyst.plans.logical.LogicalPlan-)([SparkSession](../../../../org/apache/spark/sql/SparkSession.html "class in org.apache.spark.sql") sparkSession, org.apache.spark.sql.catalyst.plans.logical.LogicalPlan logicalPlan)
static [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[ofRows](../../../../org/apache/spark/sql/Dataset.html#ofRows-org.apache.spark.sql.SparkSession-org.apache.spark.sql.catalyst.plans.logical.LogicalPlan-org.apache.spark.sql.catalyst.QueryPlanningTracker-)([SparkSession](../../../../org/apache/spark/sql/SparkSession.html "class in org.apache.spark.sql") sparkSession, org.apache.spark.sql.catalyst.plans.logical.LogicalPlan logicalPlan, org.apache.spark.sql.catalyst.QueryPlanningTracker tracker)
A variant of ofRows that allows passing in a tracker so we can track query parsing time.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[orderBy](../../../../org/apache/spark/sql/Dataset.html#orderBy-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... sortExprs)
Returns a new Dataset sorted by the given expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[orderBy](../../../../org/apache/spark/sql/Dataset.html#orderBy-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> sortExprs)
Returns a new Dataset sorted by the given expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[orderBy](../../../../org/apache/spark/sql/Dataset.html#orderBy-java.lang.String-scala.collection.Seq-)(String sortCol, scala.collection.Seq<String> sortCols)
Returns a new Dataset sorted by the given expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[orderBy](../../../../org/apache/spark/sql/Dataset.html#orderBy-java.lang.String-java.lang.String...-)(String sortCol, String... sortCols)
Returns a new Dataset sorted by the given expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[persist](../../../../org/apache/spark/sql/Dataset.html#persist--)()
Persist this Dataset with the default storage level (`MEMORY_AND_DISK`).
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[persist](../../../../org/apache/spark/sql/Dataset.html#persist-org.apache.spark.storage.StorageLevel-)([StorageLevel](../../../../org/apache/spark/storage/StorageLevel.html "class in org.apache.spark.storage") newLevel)
Persist this Dataset with the given storage level.
void
[printSchema](../../../../org/apache/spark/sql/Dataset.html#printSchema--)()
Prints the schema to the console in a nice tree format.
void
[printSchema](../../../../org/apache/spark/sql/Dataset.html#printSchema-int-)(int level)
Prints the schema up to the given level to the console in a nice tree format.
org.apache.spark.sql.execution.QueryExecution
[queryExecution](../../../../org/apache/spark/sql/Dataset.html#queryExecution--)()
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>[]
[randomSplit](../../../../org/apache/spark/sql/Dataset.html#randomSplit-double:A-)(double[] weights)
Randomly splits this Dataset with the provided weights.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>[]
[randomSplit](../../../../org/apache/spark/sql/Dataset.html#randomSplit-double:A-long-)(double[] weights, long seed)
Randomly splits this Dataset with the provided weights.
java.util.List<[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>>
[randomSplitAsList](../../../../org/apache/spark/sql/Dataset.html#randomSplitAsList-double:A-long-)(double[] weights, long seed)
Returns a Java list of Datasets obtained by randomly splitting this Dataset with the provided weights.
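For example, a train/test split (a sketch, assuming `df` exists):

```java
// 80/20 split with a fixed seed; weights are normalized if they
// do not sum to 1.
Dataset<Row>[] parts = df.randomSplit(new double[] {0.8, 0.2}, 42L);
Dataset<Row> train = parts[0];
Dataset<Row> test = parts[1];
```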
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[rdd](../../../../org/apache/spark/sql/Dataset.html#rdd--)()
[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")
[reduce](../../../../org/apache/spark/sql/Dataset.html#reduce-scala.Function2-)(scala.Function2<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)
(Scala-specific) Reduces the elements of this Dataset using the specified binary function.
[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")
[reduce](../../../../org/apache/spark/sql/Dataset.html#reduce-org.apache.spark.api.java.function.ReduceFunction-)([ReduceFunction](../../../../org/apache/spark/api/java/function/ReduceFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)
(Java-specific) Reduces the elements of this Dataset using the specified binary function.
void
[registerTempTable](../../../../org/apache/spark/sql/Dataset.html#registerTempTable-java.lang.String-)(String tableName)
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... partitionExprs)
Returns a new Dataset partitioned by the given partitioning expressions, using `spark.sql.shuffle.partitions` as the number of partitions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-int-)(int numPartitions)
Returns a new Dataset that has exactly `numPartitions` partitions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-int-org.apache.spark.sql.Column...-)(int numPartitions,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... partitionExprs)
Returns a new Dataset partitioned by the given partitioning expressions into `numPartitions`.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-int-scala.collection.Seq-)(int numPartitions, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> partitionExprs)
Returns a new Dataset partitioned by the given partitioning expressions into `numPartitions`.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> partitionExprs)
Returns a new Dataset partitioned by the given partitioning expressions, using `spark.sql.shuffle.partitions` as the number of partitions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[repartitionByRange](../../../../org/apache/spark/sql/Dataset.html#repartitionByRange-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... partitionExprs)
Returns a new Dataset partitioned by the given partitioning expressions, using `spark.sql.shuffle.partitions` as the number of partitions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[repartitionByRange](../../../../org/apache/spark/sql/Dataset.html#repartitionByRange-int-org.apache.spark.sql.Column...-)(int numPartitions,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... partitionExprs)
Returns a new Dataset partitioned by the given partitioning expressions into `numPartitions`.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[repartitionByRange](../../../../org/apache/spark/sql/Dataset.html#repartitionByRange-int-scala.collection.Seq-)(int numPartitions, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> partitionExprs)
Returns a new Dataset partitioned by the given partitioning expressions into `numPartitions`.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[repartitionByRange](../../../../org/apache/spark/sql/Dataset.html#repartitionByRange-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> partitionExprs)
Returns a new Dataset partitioned by the given partitioning expressions, using `spark.sql.shuffle.partitions` as the number of partitions.
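A sketch contrasting hash and range partitioning, assuming `dept` and `age` columns:

```java
// Hash-partition into 8 partitions by dept (full shuffle).
Dataset<Row> byDept = df.repartition(8, df.col("dept"));

// Range-partition by age; the partition count falls back to
// spark.sql.shuffle.partitions when not specified.
Dataset<Row> byAge = df.repartitionByRange(df.col("age"));
```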
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[rollup](../../../../org/apache/spark/sql/Dataset.html#rollup-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)
Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[rollup](../../../../org/apache/spark/sql/Dataset.html#rollup-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)
Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[rollup](../../../../org/apache/spark/sql/Dataset.html#rollup-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)
Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.
[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")
[rollup](../../../../org/apache/spark/sql/Dataset.html#rollup-java.lang.String-java.lang.String...-)(String col1, String... cols)
Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.
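A short sketch of rollup; the `sales` Dataset and its column names are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.sum;

// `sales` is a hypothetical Dataset<Row> with columns "city", "year", "amount".
// rollup produces groupings for (city, year), (city), and the grand total;
// rolled-up levels appear as nulls in the grouping columns.
Dataset<Row> rolled = sales
    .rollup("city", "year")
    .agg(sum("amount").alias("total"));
rolled.show();
```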
boolean
[sameSemantics](../../../../org/apache/spark/sql/Dataset.html#sameSemantics-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)
Returns true when the logical query plans inside both Datasets are equal and therefore return the same results.
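For illustration, two independently built but semantically identical plans; assumes an existing SparkSession `spark`.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// where(String) and filter(String) build the same logical plan here, so
// sameSemantics is expected to return true and the semantic hashes to match.
Dataset<Row> a = spark.range(0, 10).toDF("id").where("id > 3");
Dataset<Row> b = spark.range(0, 10).toDF("id").filter("id > 3");

System.out.println(a.sameSemantics(b));                   // true
System.out.println(a.semanticHash() == b.semanticHash()); // true
```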
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sample](../../../../org/apache/spark/sql/Dataset.html#sample-boolean-double-)(boolean withReplacement, double fraction)
Returns a new Dataset by sampling a fraction of rows, using a random seed.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sample](../../../../org/apache/spark/sql/Dataset.html#sample-boolean-double-long-)(boolean withReplacement, double fraction, long seed)
Returns a new Dataset by sampling a fraction of rows, using a user-supplied seed.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sample](../../../../org/apache/spark/sql/Dataset.html#sample-double-)(double fraction)
Returns a new Dataset by sampling a fraction of rows (without replacement), using a random seed.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sample](../../../../org/apache/spark/sql/Dataset.html#sample-double-long-)(double fraction, long seed)
Returns a new Dataset by sampling a fraction of rows (without replacement), using a user-supplied seed.
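A brief sketch of the sampling variants; `df` is a hypothetical Dataset<Row>.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// `fraction` is a per-row probability, not an exact count, so the sampled
// size varies between runs.
Dataset<Row> roughTenPercent = df.sample(0.1);
Dataset<Row> reproducible    = df.sample(0.1, 42L);        // fixed seed
Dataset<Row> withReplacement = df.sample(true, 0.5, 42L);  // rows may repeat
```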
[StructType](../../../../org/apache/spark/sql/types/StructType.html "class in org.apache.spark.sql.types")
[schema](../../../../org/apache/spark/sql/Dataset.html#schema--)()
Returns the schema of this Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)
Selects a set of column-based expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[select](../../../../org/apache/spark/sql/Dataset.html#select-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)
Selects a set of column-based expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[select](../../../../org/apache/spark/sql/Dataset.html#select-java.lang.String-scala.collection.Seq-)(String col, scala.collection.Seq<String> cols)
Selects a set of columns.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[select](../../../../org/apache/spark/sql/Dataset.html#select-java.lang.String-java.lang.String...-)(String col, String... cols)
Selects a set of columns.
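A minimal sketch of the untyped select variants; `df` and its column names are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.col;

// Plain column names, or Column expressions that can be transformed inline.
Dataset<Row> byName   = df.select("name", "age");
Dataset<Row> byColumn = df.select(col("name"), col("age").plus(1).alias("age_next"));
```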
<U1> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U1>
[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1)
Returns a new Dataset by computing the given Column expression for each element.
<U1,U2> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple2<U1,U2>>
[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U2> c2)
Returns a new Dataset by computing the given Column expressions for each element.
<U1,U2,U3> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple3<U1,U2,U3>>
[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U2> c2,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U3> c3)
Returns a new Dataset by computing the given Column expressions for each element.
<U1,U2,U3,U4> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple4<U1,U2,U3,U4>>
[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U2> c2,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U3> c3,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U4> c4)
Returns a new Dataset by computing the given Column expressions for each element.
<U1,U2,U3,U4,U5> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple5<U1,U2,U3,U4,U5>>
[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U2> c2,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U3> c3,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U4> c4,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U5> c5)
Returns a new Dataset by computing the given Column expressions for each element.
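In Java, a convenient route to the same strongly typed result is an untyped select followed by as(Encoder); the TypedColumn overloads above serve this purpose directly in Scala. The `df` columns below are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import scala.Tuple2;
import static org.apache.spark.sql.functions.col;

// Assumes `df` has a string column "name" and an integer column "age".
// The tuple encoder turns the two-column projection into a typed Dataset.
Dataset<Tuple2<String, Integer>> pairs = df
    .select(col("name"), col("age"))
    .as(Encoders.tuple(Encoders.STRING(), Encoders.INT()));
```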
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[selectExpr](../../../../org/apache/spark/sql/Dataset.html#selectExpr-scala.collection.Seq-)(scala.collection.Seq<String> exprs)
Selects a set of SQL expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[selectExpr](../../../../org/apache/spark/sql/Dataset.html#selectExpr-java.lang.String...-)(String... exprs)
Selects a set of SQL expressions.
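A one-line sketch: each argument is parsed as a SQL expression, equivalent to calling select with expr("...") per argument. Column names are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

Dataset<Row> projected =
    df.selectExpr("name", "age + 1 AS age_next", "upper(name) AS name_upper");
```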
int
[semanticHash](../../../../org/apache/spark/sql/Dataset.html#semanticHash--)()
Returns a hashCode of the logical query plan against this Dataset.
void
[show](../../../../org/apache/spark/sql/Dataset.html#show--)()
Displays the top 20 rows of the Dataset in a tabular form.
void
[show](../../../../org/apache/spark/sql/Dataset.html#show-boolean-)(boolean truncate)
Displays the top 20 rows of the Dataset in a tabular form.
void
[show](../../../../org/apache/spark/sql/Dataset.html#show-int-)(int numRows)
Displays the Dataset in a tabular form.
void
[show](../../../../org/apache/spark/sql/Dataset.html#show-int-boolean-)(int numRows, boolean truncate)
Displays the Dataset in a tabular form.
void
[show](../../../../org/apache/spark/sql/Dataset.html#show-int-int-)(int numRows, int truncate)
Displays the Dataset in a tabular form.
void
[show](../../../../org/apache/spark/sql/Dataset.html#show-int-int-boolean-)(int numRows, int truncate, boolean vertical)
Displays the Dataset in a tabular form.
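A quick sketch of the common show variants; `df` is a hypothetical Dataset<Row>.

```java
df.show();           // first 20 rows; string cells truncated to 20 characters
df.show(5, false);   // first 5 rows, no truncation
df.show(5, 0, true); // first 5 rows, untruncated, vertical one-value-per-line layout
```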
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sort](../../../../org/apache/spark/sql/Dataset.html#sort-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... sortExprs)
Returns a new Dataset sorted by the given expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sort](../../../../org/apache/spark/sql/Dataset.html#sort-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> sortExprs)
Returns a new Dataset sorted by the given expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sort](../../../../org/apache/spark/sql/Dataset.html#sort-java.lang.String-scala.collection.Seq-)(String sortCol, scala.collection.Seq<String> sortCols)
Returns a new Dataset sorted by the specified columns, all in ascending order.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sort](../../../../org/apache/spark/sql/Dataset.html#sort-java.lang.String-java.lang.String...-)(String sortCol, String... sortCols)
Returns a new Dataset sorted by the specified columns, all in ascending order.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sortWithinPartitions](../../../../org/apache/spark/sql/Dataset.html#sortWithinPartitions-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... sortExprs)
Returns a new Dataset with each partition sorted by the given expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sortWithinPartitions](../../../../org/apache/spark/sql/Dataset.html#sortWithinPartitions-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> sortExprs)
Returns a new Dataset with each partition sorted by the given expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sortWithinPartitions](../../../../org/apache/spark/sql/Dataset.html#sortWithinPartitions-java.lang.String-scala.collection.Seq-)(String sortCol, scala.collection.Seq<String> sortCols)
Returns a new Dataset with each partition sorted by the given expressions.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[sortWithinPartitions](../../../../org/apache/spark/sql/Dataset.html#sortWithinPartitions-java.lang.String-java.lang.String...-)(String sortCol, String... sortCols)
Returns a new Dataset with each partition sorted by the given expressions.
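A minimal sketch of the difference between the two sorts; `df` and its columns are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.col;

// sort performs a global, shuffling sort; sortWithinPartitions only sorts
// each partition locally (SQL's SORT BY), which is cheaper when a total
// order is not required.
Dataset<Row> global = df.sort(col("age").desc(), col("name"));
Dataset<Row> local  = df.sortWithinPartitions("age");
```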
[SparkSession](../../../../org/apache/spark/sql/SparkSession.html "class in org.apache.spark.sql")
[sparkSession](../../../../org/apache/spark/sql/Dataset.html#sparkSession--)()
[SQLContext](../../../../org/apache/spark/sql/SQLContext.html "class in org.apache.spark.sql")
[sqlContext](../../../../org/apache/spark/sql/Dataset.html#sqlContext--)()
[DataFrameStatFunctions](../../../../org/apache/spark/sql/DataFrameStatFunctions.html "class in org.apache.spark.sql")
[stat](../../../../org/apache/spark/sql/Dataset.html#stat--)()
[StorageLevel](../../../../org/apache/spark/storage/StorageLevel.html "class in org.apache.spark.storage")
[storageLevel](../../../../org/apache/spark/sql/Dataset.html#storageLevel--)()
Get the Dataset's current storage level, or StorageLevel.NONE if not persisted.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[summary](../../../../org/apache/spark/sql/Dataset.html#summary-scala.collection.Seq-)(scala.collection.Seq<String> statistics)
Computes specified statistics for numeric and string columns.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[summary](../../../../org/apache/spark/sql/Dataset.html#summary-java.lang.String...-)(String... statistics)
Computes specified statistics for numeric and string columns.
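A short sketch; with no arguments, summary() computes count, mean, stddev, min, the approximate 25%, 50%, and 75% percentiles, and max. `df` is hypothetical.

```java
df.summary().show();                              // full default statistics
df.summary("count", "min", "50%", "max").show();  // only the requested ones
```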
Object
[tail](../../../../org/apache/spark/sql/Dataset.html#tail-int-)(int n)
Returns the last n rows in the Dataset.
Object
[take](../../../../org/apache/spark/sql/Dataset.html#take-int-)(int n)
Returns the first n rows in the Dataset.
java.util.List<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[takeAsList](../../../../org/apache/spark/sql/Dataset.html#takeAsList-int-)(int n)
Returns the first n rows in the Dataset as a list.
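A small sketch; all three run a job and bring rows to the driver, so keep n small. `df` is hypothetical.

```java
import java.util.List;
import org.apache.spark.sql.Row;

// In Java, take and tail return an erased array (declared as Object), so
// takeAsList is the most convenient typed accessor.
List<Row> firstThree = df.takeAsList(3);
// For a Dataset<Row>, the runtime element type of the erased array is Row.
Row[] lastTwo = (Row[]) df.tail(2);
```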
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[to](../../../../org/apache/spark/sql/Dataset.html#to-org.apache.spark.sql.types.StructType-)([StructType](../../../../org/apache/spark/sql/types/StructType.html "class in org.apache.spark.sql.types") schema)
Returns a new DataFrame where each row is reconciled to match the specified schema.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[toDF](../../../../org/apache/spark/sql/Dataset.html#toDF--)()
Converts this strongly typed collection of data to a generic DataFrame.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[toDF](../../../../org/apache/spark/sql/Dataset.html#toDF-scala.collection.Seq-)(scala.collection.Seq<String> colNames)
Converts this strongly typed collection of data to a generic DataFrame with columns renamed.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[toDF](../../../../org/apache/spark/sql/Dataset.html#toDF-java.lang.String...-)(String... colNames)
Converts this strongly typed collection of data to a generic DataFrame with columns renamed.
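A one-liner sketch of the positional rename; the number of names must match the number of columns. Assumes an existing SparkSession `spark`.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

Dataset<Row> renamed = spark.range(0, 3)
    .toDF("id")
    .selectExpr("id", "id * 2")
    .toDF("n", "n_doubled");  // renames both columns by position
renamed.printSchema();
```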
[JavaRDD](../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[toJavaRDD](../../../../org/apache/spark/sql/Dataset.html#toJavaRDD--)()
Returns the content of the Dataset as a JavaRDD of Ts.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<String>
[toJSON](../../../../org/apache/spark/sql/Dataset.html#toJSON--)()
Returns the content of the Dataset as a Dataset of JSON strings.
java.util.Iterator<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[toLocalIterator](../../../../org/apache/spark/sql/Dataset.html#toLocalIterator--)()
Returns an iterator that contains all rows in this Dataset.
String
[toString](../../../../org/apache/spark/sql/Dataset.html#toString--)()
<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>
[transform](../../../../org/apache/spark/sql/Dataset.html#transform-scala.Function1-)(scala.Function1<[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>,[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>> t)
Concise syntax for chaining custom transformations.
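A minimal sketch of chaining; because scala.Function1 is a SAM type under Scala 2.12+, a Java lambda can be passed directly. `df` and its columns are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Each transform step is a reusable Dataset -> Dataset function, so custom
// logic can be pipelined without breaking the method chain.
Dataset<Row> result = df
    .transform(d -> d.where("age >= 18"))
    .transform(d -> d.withColumn("age_next", d.col("age").plus(1)));
```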
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[union](../../../../org/apache/spark/sql/Dataset.html#union-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)
Returns a new Dataset containing the union of rows in this Dataset and another Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[unionAll](../../../../org/apache/spark/sql/Dataset.html#unionAll-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)
Returns a new Dataset containing the union of rows in this Dataset and another Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[unionByName](../../../../org/apache/spark/sql/Dataset.html#unionByName-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)
Returns a new Dataset containing the union of rows in this Dataset and another Dataset.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[unionByName](../../../../org/apache/spark/sql/Dataset.html#unionByName-org.apache.spark.sql.Dataset-boolean-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other, boolean allowMissingColumns)
Returns a new Dataset containing the union of rows in this Dataset and another Dataset.
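A short sketch; union resolves columns by position (like SQL UNION ALL, duplicates are kept), while unionByName resolves them by name. Assumes an existing SparkSession `spark`.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

Dataset<Row> a = spark.sql("SELECT 1 AS id, 'x' AS tag");
Dataset<Row> b = spark.sql("SELECT 2 AS id, 'y' AS tag, true AS flag");

// With allowMissingColumns = true, the `flag` column missing on the left is
// filled with nulls for rows coming from `a`.
Dataset<Row> merged = a.unionByName(b, true);
merged.show();
```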
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[unpersist](../../../../org/apache/spark/sql/Dataset.html#unpersist--)()
Mark the Dataset as non-persistent, and remove all blocks for it from memory and disk.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[unpersist](../../../../org/apache/spark/sql/Dataset.html#unpersist-boolean-)(boolean blocking)
Mark the Dataset as non-persistent, and remove all blocks for it from memory and disk.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[unpivot](../../../../org/apache/spark/sql/Dataset.html#unpivot-org.apache.spark.sql.Column:A-org.apache.spark.sql.Column:A-java.lang.String-java.lang.String-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] ids,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] values, String variableColumnName, String valueColumnName)
Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[unpivot](../../../../org/apache/spark/sql/Dataset.html#unpivot-org.apache.spark.sql.Column:A-java.lang.String-java.lang.String-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] ids, String variableColumnName, String valueColumnName)
Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.
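A minimal sketch of the wide-to-long reshape: each listed value column becomes one output row of (variable, value). The `scores` Dataset and its columns are hypothetical.

```java
import org.apache.spark.sql.Column;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.col;

// Assumes `scores` has columns "student", "math", "physics".
Dataset<Row> longFormat = scores.unpivot(
    new Column[] { col("student") },               // identifier columns kept as-is
    new Column[] { col("math"), col("physics") },  // value columns to melt
    "subject",                                     // name of the variable column
    "score");                                      // name of the value column
```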
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[where](../../../../org/apache/spark/sql/Dataset.html#where-org.apache.spark.sql.Column-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") condition)
Filters rows using the given condition.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[where](../../../../org/apache/spark/sql/Dataset.html#where-java.lang.String-)(String conditionExpr)
Filters rows using the given SQL expression.
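A one-line sketch; where is an alias for filter, and both a Column condition and a SQL expression string are accepted. Column names are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.col;

Dataset<Row> adults       = df.where(col("age").geq(18)); // Column condition
Dataset<Row> adultsViaSql = df.where("age >= 18");         // SQL expression string
```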
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[withColumn](../../../../org/apache/spark/sql/Dataset.html#withColumn-java.lang.String-org.apache.spark.sql.Column-)(String colName,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") col)
Returns a new Dataset by adding a column or replacing the existing column that has the same name.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[withColumnRenamed](../../../../org/apache/spark/sql/Dataset.html#withColumnRenamed-java.lang.String-java.lang.String-)(String existingName, String newName)
Returns a new Dataset with a column renamed.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[withColumns](../../../../org/apache/spark/sql/Dataset.html#withColumns-scala.collection.immutable.Map-)(scala.collection.immutable.Map<String,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> colsMap)
(Scala-specific) Returns a new Dataset by adding columns or replacing the existing columns that have the same names.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[withColumns](../../../../org/apache/spark/sql/Dataset.html#withColumns-java.util.Map-)(java.util.Map<String,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> colsMap)
(Java-specific) Returns a new Dataset by adding columns or replacing the existing columns that have the same names.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[withColumnsRenamed](../../../../org/apache/spark/sql/Dataset.html#withColumnsRenamed-scala.collection.immutable.Map-)(scala.collection.immutable.Map<String,String> colsMap)
(Scala-specific) Returns a new Dataset with columns renamed.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[withColumnsRenamed](../../../../org/apache/spark/sql/Dataset.html#withColumnsRenamed-java.util.Map-)(java.util.Map<String,String> colsMap)
(Java-specific) Returns a new Dataset with columns renamed.
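A sketch of the multi-column variants; adding or renaming several columns in one call keeps the plan flat instead of stacking one projection per chained withColumn/withColumnRenamed. `df` and its columns are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.spark.sql.Column;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.upper;

// Add or replace several columns at once.
Map<String, Column> newCols = new HashMap<>();
newCols.put("age_next", col("age").plus(1));
newCols.put("name_upper", upper(col("name")));
Dataset<Row> enriched = df.withColumns(newCols);

// Rename several columns at once.
Map<String, String> renames = new HashMap<>();
renames.put("name", "full_name");
renames.put("age", "age_years");
Dataset<Row> renamed = df.withColumnsRenamed(renames);
```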
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>
[withMetadata](../../../../org/apache/spark/sql/Dataset.html#withMetadata-java.lang.String-org.apache.spark.sql.types.Metadata-)(String columnName,[Metadata](../../../../org/apache/spark/sql/types/Metadata.html "class in org.apache.spark.sql.types") metadata)
Returns a new Dataset by updating an existing column with metadata.
[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[withWatermark](../../../../org/apache/spark/sql/Dataset.html#withWatermark-java.lang.String-java.lang.String-)(String eventTime, String delayThreshold)
Defines an event time watermark for this Dataset.
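A one-line sketch for streaming Datasets; the `events` Dataset and its eventTime column are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Rows arriving more than 10 minutes behind the maximum observed event time
// may be dropped, which bounds the state kept by windowed aggregations.
Dataset<Row> withLateness = events.withWatermark("eventTime", "10 minutes");
```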
[DataFrameWriter](../../../../org/apache/spark/sql/DataFrameWriter.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[write](../../../../org/apache/spark/sql/Dataset.html#write--)()
Interface for saving the content of the non-streaming Dataset out into external storage.
[DataStreamWriter](../../../../org/apache/spark/sql/streaming/DataStreamWriter.html "class in org.apache.spark.sql.streaming")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[writeStream](../../../../org/apache/spark/sql/Dataset.html#writeStream--)()
Interface for saving the content of the streaming Dataset out into external storage.
[DataFrameWriterV2](../../../../org/apache/spark/sql/DataFrameWriterV2.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>
[writeTo](../../../../org/apache/spark/sql/Dataset.html#writeTo-java.lang.String-)(String table)
Create a write configuration builder for v2 sources.
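A closing sketch of the three write entry points; `df`, `streamingDf`, the output path, and the table name are placeholders.

```java
import java.util.concurrent.TimeoutException;
import org.apache.spark.sql.streaming.StreamingQuery;

// v1 batch writer.
df.write().mode("overwrite").parquet("/tmp/out");

// Streaming writer; in Java, start() declares a checked TimeoutException,
// so it must be caught or rethrown.
try {
  StreamingQuery query = streamingDf.writeStream()
      .format("console")
      .outputMode("append")
      .start();
} catch (TimeoutException e) {
  throw new RuntimeException(e);
}

// v2 writer addressed at a catalog table; createOrReplace() avoids the
// checked NoSuchTableException that append() declares.
df.writeTo("spark_catalog.default.events_copy").createOrReplace();
```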