Dataset (Spark 3.5.5 JavaDoc)

Modifier and Type

Method and Description

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[agg](../../../../org/apache/spark/sql/Dataset.html#agg-org.apache.spark.sql.Column-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... exprs)

Aggregates on the entire Dataset without groups.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[agg](../../../../org/apache/spark/sql/Dataset.html#agg-org.apache.spark.sql.Column-scala.collection.Seq-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> exprs)

Aggregates on the entire Dataset without groups.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[agg](../../../../org/apache/spark/sql/Dataset.html#agg-scala.collection.immutable.Map-)(scala.collection.immutable.Map<String,String> exprs)

(Scala-specific) Aggregates on the entire Dataset without groups.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[agg](../../../../org/apache/spark/sql/Dataset.html#agg-java.util.Map-)(java.util.Map<String,String> exprs)

(Java-specific) Aggregates on the entire Dataset without groups.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[agg](../../../../org/apache/spark/sql/Dataset.html#agg-scala.Tuple2-scala.collection.Seq-)(scala.Tuple2<String,String> aggExpr, scala.collection.Seq<scala.Tuple2<String,String>> aggExprs)

(Scala-specific) Aggregates on the entire Dataset without groups.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[alias](../../../../org/apache/spark/sql/Dataset.html#alias-java.lang.String-)(String alias)

Returns a new Dataset with an alias set.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[alias](../../../../org/apache/spark/sql/Dataset.html#alias-scala.Symbol-)(scala.Symbol alias)

(Scala-specific) Returns a new Dataset with an alias set.

[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")

[apply](../../../../org/apache/spark/sql/Dataset.html#apply-java.lang.String-)(String colName)

Selects a column based on the column name and returns it as a Column.

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>

[as](../../../../org/apache/spark/sql/Dataset.html#as-org.apache.spark.sql.Encoder-)([Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> evidence$2)

Returns a new Dataset where each record has been mapped on to the specified type.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[as](../../../../org/apache/spark/sql/Dataset.html#as-java.lang.String-)(String alias)

Returns a new Dataset with an alias set.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[as](../../../../org/apache/spark/sql/Dataset.html#as-scala.Symbol-)(scala.Symbol alias)

(Scala-specific) Returns a new Dataset with an alias set.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[cache](../../../../org/apache/spark/sql/Dataset.html#cache--)()

Persist this Dataset with the default storage level (MEMORY_AND_DISK).

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[checkpoint](../../../../org/apache/spark/sql/Dataset.html#checkpoint--)()

Eagerly checkpoints this Dataset and returns the new Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[checkpoint](../../../../org/apache/spark/sql/Dataset.html#checkpoint-boolean-)(boolean eager)

Returns a checkpointed version of this Dataset.

scala.reflect.ClassTag<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[classTag](../../../../org/apache/spark/sql/Dataset.html#classTag--)()

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[coalesce](../../../../org/apache/spark/sql/Dataset.html#coalesce-int-)(int numPartitions)

Returns a new Dataset that has exactly numPartitions partitions, when fewer partitions are requested.

static String

[COL_POS_KEY](../../../../org/apache/spark/sql/Dataset.html#COL%5FPOS%5FKEY--)()

[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")

[col](../../../../org/apache/spark/sql/Dataset.html#col-java.lang.String-)(String colName)

Selects a column based on the column name and returns it as a Column.

Object

[collect](../../../../org/apache/spark/sql/Dataset.html#collect--)()

Returns an array that contains all rows in this Dataset.

java.util.List<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[collectAsList](../../../../org/apache/spark/sql/Dataset.html#collectAsList--)()

Returns a Java list that contains all rows in this Dataset.

[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")

[colRegex](../../../../org/apache/spark/sql/Dataset.html#colRegex-java.lang.String-)(String colName)

Selects a column based on the column name specified as a regex and returns it as a Column.

String[]

[columns](../../../../org/apache/spark/sql/Dataset.html#columns--)()

Returns all column names as an array.

long

[count](../../../../org/apache/spark/sql/Dataset.html#count--)()

Returns the number of rows in the Dataset.

void

[createGlobalTempView](../../../../org/apache/spark/sql/Dataset.html#createGlobalTempView-java.lang.String-)(String viewName)

Creates a global temporary view using the given name.

void

[createOrReplaceGlobalTempView](../../../../org/apache/spark/sql/Dataset.html#createOrReplaceGlobalTempView-java.lang.String-)(String viewName)

Creates or replaces a global temporary view using the given name.

void

[createOrReplaceTempView](../../../../org/apache/spark/sql/Dataset.html#createOrReplaceTempView-java.lang.String-)(String viewName)

Creates or replaces a local temporary view using the given name.

void

[createTempView](../../../../org/apache/spark/sql/Dataset.html#createTempView-java.lang.String-)(String viewName)

Creates a local temporary view using the given name.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[crossJoin](../../../../org/apache/spark/sql/Dataset.html#crossJoin-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right)

Explicit Cartesian join with another DataFrame.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[cube](../../../../org/apache/spark/sql/Dataset.html#cube-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)

Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[cube](../../../../org/apache/spark/sql/Dataset.html#cube-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)

Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[cube](../../../../org/apache/spark/sql/Dataset.html#cube-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)

Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[cube](../../../../org/apache/spark/sql/Dataset.html#cube-java.lang.String-java.lang.String...-)(String col1, String... cols)

Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.

static java.util.concurrent.atomic.AtomicLong

[curId](../../../../org/apache/spark/sql/Dataset.html#curId--)()

static String

[DATASET_ID_KEY](../../../../org/apache/spark/sql/Dataset.html#DATASET%5FID%5FKEY--)()

static org.apache.spark.sql.catalyst.trees.TreeNodeTag<scala.collection.mutable.HashSet<Object>>

[DATASET_ID_TAG](../../../../org/apache/spark/sql/Dataset.html#DATASET%5FID%5FTAG--)()

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[describe](../../../../org/apache/spark/sql/Dataset.html#describe-scala.collection.Seq-)(scala.collection.Seq<String> cols)

Computes basic statistics for numeric and string columns, including count, mean, stddev, min, and max.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[describe](../../../../org/apache/spark/sql/Dataset.html#describe-java.lang.String...-)(String... cols)

Computes basic statistics for numeric and string columns, including count, mean, stddev, min, and max.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[distinct](../../../../org/apache/spark/sql/Dataset.html#distinct--)()

Returns a new Dataset that contains only the unique rows from this Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[drop](../../../../org/apache/spark/sql/Dataset.html#drop-org.apache.spark.sql.Column-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") col)

Returns a new Dataset with a column dropped.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[drop](../../../../org/apache/spark/sql/Dataset.html#drop-org.apache.spark.sql.Column-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") col,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)

Returns a new Dataset with columns dropped.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[drop](../../../../org/apache/spark/sql/Dataset.html#drop-org.apache.spark.sql.Column-scala.collection.Seq-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") col, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)

Returns a new Dataset with columns dropped.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[drop](../../../../org/apache/spark/sql/Dataset.html#drop-scala.collection.Seq-)(scala.collection.Seq<String> colNames)

Returns a new Dataset with columns dropped.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[drop](../../../../org/apache/spark/sql/Dataset.html#drop-java.lang.String...-)(String... colNames)

Returns a new Dataset with columns dropped.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[drop](../../../../org/apache/spark/sql/Dataset.html#drop-java.lang.String-)(String colName)

Returns a new Dataset with a column dropped.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates--)()

Returns a new Dataset that contains only the unique rows from this Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates-scala.collection.Seq-)(scala.collection.Seq<String> colNames)

(Scala-specific) Returns a new Dataset with duplicate rows removed, considering only the subset of columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates-java.lang.String:A-)(String[] colNames)

Returns a new Dataset with duplicate rows removed, considering only the subset of columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)

Returns a new Dataset with duplicate rows removed, considering only the subset of columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicates](../../../../org/apache/spark/sql/Dataset.html#dropDuplicates-java.lang.String-java.lang.String...-)(String col1, String... cols)

Returns a new Dataset with duplicate rows removed, considering only the subset of columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark--)()

Returns a new Dataset with duplicate rows removed, within watermark.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark-scala.collection.Seq-)(scala.collection.Seq<String> colNames)

Returns a new Dataset with duplicate rows removed, considering only the subset of columns, within watermark.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark-java.lang.String:A-)(String[] colNames)

Returns a new Dataset with duplicate rows removed, considering only the subset of columns, within watermark.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)

Returns a new Dataset with duplicate rows removed, considering only the subset of columns, within watermark.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[dropDuplicatesWithinWatermark](../../../../org/apache/spark/sql/Dataset.html#dropDuplicatesWithinWatermark-java.lang.String-java.lang.String...-)(String col1, String... cols)

Returns a new Dataset with duplicate rows removed, considering only the subset of columns, within watermark.

scala.Tuple2<String,String>[]

[dtypes](../../../../org/apache/spark/sql/Dataset.html#dtypes--)()

Returns all column names and their data types as an array.

[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[encoder](../../../../org/apache/spark/sql/Dataset.html#encoder--)()

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[except](../../../../org/apache/spark/sql/Dataset.html#except-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)

Returns a new Dataset containing rows in this Dataset but not in another Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[exceptAll](../../../../org/apache/spark/sql/Dataset.html#exceptAll-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)

Returns a new Dataset containing rows in this Dataset but not in another Dataset while preserving the duplicates.
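The difference between `except` and `exceptAll` is set subtraction versus multiset subtraction (SQL `EXCEPT DISTINCT` versus `EXCEPT ALL`). The sketch below models that rule with plain Java collections rather than Spark itself, so it runs without a Spark installation. The class `ExceptModel` and its methods are hypothetical names used only for illustration and are not part of the Spark API. The output order is a property of this model; Spark does not guarantee any row order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical plain-Java model of the two semantics; this is not Spark code.
class ExceptModel {

    // EXCEPT DISTINCT: the distinct rows of left that do not occur in right.
    static <T> List<T> except(List<T> left, List<T> right) {
        Set<T> result = new LinkedHashSet<>(left); // de-duplicates left
        result.removeAll(right);                   // drops every row present in right
        return new ArrayList<>(result);
    }

    // EXCEPT ALL: each row survives max(countLeft - countRight, 0) times.
    static <T> List<T> exceptAll(List<T> left, List<T> right) {
        Map<T, Integer> toCancel = new LinkedHashMap<>();
        for (T row : right) {
            toCancel.merge(row, 1, Integer::sum);
        }
        List<T> result = new ArrayList<>();
        for (T row : left) {
            Integer remaining = toCancel.get(row);
            if (remaining != null && remaining > 0) {
                toCancel.put(row, remaining - 1); // one right row cancels one left row
            } else {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> left = List.of("a", "a", "a", "b", "c");
        List<String> right = List.of("a", "c");
        System.out.println(except(left, right));    // [b]
        System.out.println(exceptAll(left, right)); // [a, a, b]
    }
}
```

In this model, `except` discards every "a" because the value occurs in the right-hand input at all, while `exceptAll` removes only one "a" for the single "a" on the right.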

void

[explain](../../../../org/apache/spark/sql/Dataset.html#explain--)()

Prints the physical plan to the console for debugging purposes.

void

[explain](../../../../org/apache/spark/sql/Dataset.html#explain-boolean-)(boolean extended)

Prints the plans (logical and physical) to the console for debugging purposes.

void

[explain](../../../../org/apache/spark/sql/Dataset.html#explain-java.lang.String-)(String mode)

Prints the plans (logical and physical) with a format specified by a given explain mode.

<A extends scala.Product> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[explode](../../../../org/apache/spark/sql/Dataset.html#explode-scala.collection.Seq-scala.Function1-scala.reflect.api.TypeTags.TypeTag-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> input, scala.Function1<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql"),scala.collection.TraversableOnce<A>> f, scala.reflect.api.TypeTags.TypeTag<A> evidence$4)

<A,B> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[explode](../../../../org/apache/spark/sql/Dataset.html#explode-java.lang.String-java.lang.String-scala.Function1-scala.reflect.api.TypeTags.TypeTag-)(String inputColumn, String outputColumn, scala.Function1<A,scala.collection.TraversableOnce<B>> f, scala.reflect.api.TypeTags.TypeTag<B> evidence$5)

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[filter](../../../../org/apache/spark/sql/Dataset.html#filter-org.apache.spark.sql.Column-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") condition)

Filters rows using the given condition.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[filter](../../../../org/apache/spark/sql/Dataset.html#filter-org.apache.spark.api.java.function.FilterFunction-)([FilterFunction](../../../../org/apache/spark/api/java/function/FilterFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)

(Java-specific) Returns a new Dataset that only contains elements where func returns true.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[filter](../../../../org/apache/spark/sql/Dataset.html#filter-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),Object> func)

(Scala-specific) Returns a new Dataset that only contains elements where func returns true.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[filter](../../../../org/apache/spark/sql/Dataset.html#filter-java.lang.String-)(String conditionExpr)

Filters rows using the given SQL expression.

[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")

[first](../../../../org/apache/spark/sql/Dataset.html#first--)()

Returns the first row.

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>

[flatMap](../../../../org/apache/spark/sql/Dataset.html#flatMap-org.apache.spark.api.java.function.FlatMapFunction-org.apache.spark.sql.Encoder-)([FlatMapFunction](../../../../org/apache/spark/api/java/function/FlatMapFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U> f,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> encoder)

(Java-specific) Returns a new Dataset by first applying a function to all elements of this Dataset, and then flattening the results.

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>

[flatMap](../../../../org/apache/spark/sql/Dataset.html#flatMap-scala.Function1-org.apache.spark.sql.Encoder-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),scala.collection.TraversableOnce<U>> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> evidence$8)

(Scala-specific) Returns a new Dataset by first applying a function to all elements of this Dataset, and then flattening the results.

void

[foreach](../../../../org/apache/spark/sql/Dataset.html#foreach-org.apache.spark.api.java.function.ForeachFunction-)([ForeachFunction](../../../../org/apache/spark/api/java/function/ForeachFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)

(Java-specific) Runs func on each element of this Dataset.

void

[foreach](../../../../org/apache/spark/sql/Dataset.html#foreach-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),scala.runtime.BoxedUnit> f)

Applies a function f to all rows.

void

[foreachPartition](../../../../org/apache/spark/sql/Dataset.html#foreachPartition-org.apache.spark.api.java.function.ForeachPartitionFunction-)([ForeachPartitionFunction](../../../../org/apache/spark/api/java/function/ForeachPartitionFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)

(Java-specific) Runs func on each partition of this Dataset.

void

[foreachPartition](../../../../org/apache/spark/sql/Dataset.html#foreachPartition-scala.Function1-)(scala.Function1<scala.collection.Iterator<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>,scala.runtime.BoxedUnit> f)

Applies a function f to each partition of this Dataset.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[groupBy](../../../../org/apache/spark/sql/Dataset.html#groupBy-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)

Groups the Dataset using the specified columns, so we can run aggregation on them.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[groupBy](../../../../org/apache/spark/sql/Dataset.html#groupBy-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)

Groups the Dataset using the specified columns, so we can run aggregation on them.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[groupBy](../../../../org/apache/spark/sql/Dataset.html#groupBy-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)

Groups the Dataset using the specified columns, so that we can run aggregation on them.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[groupBy](../../../../org/apache/spark/sql/Dataset.html#groupBy-java.lang.String-java.lang.String...-)(String col1, String... cols)

Groups the Dataset using the specified columns, so that we can run aggregation on them.

<K> [KeyValueGroupedDataset](../../../../org/apache/spark/sql/KeyValueGroupedDataset.html "class in org.apache.spark.sql")<K,[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[groupByKey](../../../../org/apache/spark/sql/Dataset.html#groupByKey-scala.Function1-org.apache.spark.sql.Encoder-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),K> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<K> evidence$3)

(Scala-specific) Returns a KeyValueGroupedDataset where the data is grouped by the given key func.

<K> [KeyValueGroupedDataset](../../../../org/apache/spark/sql/KeyValueGroupedDataset.html "class in org.apache.spark.sql")<K,[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[groupByKey](../../../../org/apache/spark/sql/Dataset.html#groupByKey-org.apache.spark.api.java.function.MapFunction-org.apache.spark.sql.Encoder-)([MapFunction](../../../../org/apache/spark/api/java/function/MapFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),K> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<K> encoder)

(Java-specific) Returns a KeyValueGroupedDataset where the data is grouped by the given key func.

[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")

[head](../../../../org/apache/spark/sql/Dataset.html#head--)()

Returns the first row.

Object

[head](../../../../org/apache/spark/sql/Dataset.html#head-int-)(int n)

Returns the first n rows.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[hint](../../../../org/apache/spark/sql/Dataset.html#hint-java.lang.String-java.lang.Object...-)(String name, Object... parameters)

Specifies some hint on the current Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[hint](../../../../org/apache/spark/sql/Dataset.html#hint-java.lang.String-scala.collection.Seq-)(String name, scala.collection.Seq<Object> parameters)

Specifies some hint on the current Dataset.

String[]

[inputFiles](../../../../org/apache/spark/sql/Dataset.html#inputFiles--)()

Returns a best-effort snapshot of the files that compose this Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[intersect](../../../../org/apache/spark/sql/Dataset.html#intersect-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)

Returns a new Dataset containing only the rows that appear in both this Dataset and another Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[intersectAll](../../../../org/apache/spark/sql/Dataset.html#intersectAll-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)

Returns a new Dataset containing only the rows that appear in both this Dataset and another Dataset, while preserving duplicates.
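`intersect` and `intersectAll` differ in the same way as SQL `INTERSECT DISTINCT` and `INTERSECT ALL`: the first yields each common row once, the second keeps a row as many times as it occurs in both inputs. The following is a minimal plain-Java model of that rule, not Spark code. `IntersectModel` and its methods are hypothetical names chosen for this sketch, and the ordering of its output is not something Spark promises.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical plain-Java model of the two semantics; this is not Spark code.
class IntersectModel {

    // INTERSECT DISTINCT: each row present in both inputs, emitted once.
    static <T> List<T> intersect(List<T> left, List<T> right) {
        Set<T> result = new LinkedHashSet<>(left); // de-duplicates left
        result.retainAll(right);                   // keeps only rows also in right
        return new ArrayList<>(result);
    }

    // INTERSECT ALL: each row is emitted min(countLeft, countRight) times.
    static <T> List<T> intersectAll(List<T> left, List<T> right) {
        Map<T, Integer> available = new HashMap<>();
        for (T row : right) {
            available.merge(row, 1, Integer::sum);
        }
        List<T> result = new ArrayList<>();
        for (T row : left) {
            Integer remaining = available.get(row);
            if (remaining != null && remaining > 0) {
                available.put(row, remaining - 1); // consume one matching right row
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> left = List.of("a", "a", "b", "c");
        List<String> right = List.of("a", "a", "a", "c", "d");
        System.out.println(intersect(left, right));    // [a, c]
        System.out.println(intersectAll(left, right)); // [a, a, c]
    }
}
```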

boolean

[isEmpty](../../../../org/apache/spark/sql/Dataset.html#isEmpty--)()

Returns true if the Dataset is empty.

boolean

[isLocal](../../../../org/apache/spark/sql/Dataset.html#isLocal--)()

Returns true if the collect and take methods can be run locally (without any Spark executors).

boolean

[isStreaming](../../../../org/apache/spark/sql/Dataset.html#isStreaming--)()

Returns true if this Dataset contains one or more sources that continuously return data as it arrives.

[JavaRDD](../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[javaRDD](../../../../org/apache/spark/sql/Dataset.html#javaRDD--)()

Returns the content of the Dataset as a JavaRDD of Ts.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right)

Join with another DataFrame.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-org.apache.spark.sql.Column-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") joinExprs)

Inner join with another DataFrame, using the given join expression.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-org.apache.spark.sql.Column-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") joinExprs, String joinType)

Join with another DataFrame, using the given join expression.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-scala.collection.Seq-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, scala.collection.Seq<String> usingColumns)

(Scala-specific) Inner equi-join with another DataFrame using the given columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-scala.collection.Seq-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, scala.collection.Seq<String> usingColumns, String joinType)

(Scala-specific) Equi-join with another DataFrame using the given columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, String usingColumn)

Inner equi-join with another DataFrame using the given column.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-java.lang.String:A-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, String[] usingColumns)

(Java-specific) Inner equi-join with another DataFrame using the given columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-java.lang.String:A-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, String[] usingColumns, String joinType)

(Java-specific) Equi-join with another DataFrame using the given columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[join](../../../../org/apache/spark/sql/Dataset.html#join-org.apache.spark.sql.Dataset-java.lang.String-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<?> right, String usingColumn, String joinType)

Equi-join with another DataFrame using the given column.

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple2<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U>>

[joinWith](../../../../org/apache/spark/sql/Dataset.html#joinWith-org.apache.spark.sql.Dataset-org.apache.spark.sql.Column-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U> other,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") condition)

Joins this Dataset with another using an inner equi-join, returning a Tuple2 for each pair where condition evaluates to true.

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple2<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U>>

[joinWith](../../../../org/apache/spark/sql/Dataset.html#joinWith-org.apache.spark.sql.Dataset-org.apache.spark.sql.Column-java.lang.String-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U> other,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") condition, String joinType)

Joins this Dataset returning a Tuple2 for each pair where condition evaluates to true.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[limit](../../../../org/apache/spark/sql/Dataset.html#limit-int-)(int n)

Returns a new Dataset by taking the first n rows.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[localCheckpoint](../../../../org/apache/spark/sql/Dataset.html#localCheckpoint--)()

Eagerly locally checkpoints a Dataset and returns the new Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[localCheckpoint](../../../../org/apache/spark/sql/Dataset.html#localCheckpoint-boolean-)(boolean eager)

Locally checkpoints a Dataset and returns the new Dataset.

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>

[map](../../../../org/apache/spark/sql/Dataset.html#map-scala.Function1-org.apache.spark.sql.Encoder-)(scala.Function1<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> evidence$6)

(Scala-specific) Returns a new Dataset that contains the result of applying func to each element.

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>

[map](../../../../org/apache/spark/sql/Dataset.html#map-org.apache.spark.api.java.function.MapFunction-org.apache.spark.sql.Encoder-)([MapFunction](../../../../org/apache/spark/api/java/function/MapFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> encoder)

(Java-specific) Returns a new Dataset that contains the result of applying func to each element.

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>

[mapPartitions](../../../../org/apache/spark/sql/Dataset.html#mapPartitions-scala.Function1-org.apache.spark.sql.Encoder-)(scala.Function1<scala.collection.Iterator<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>,scala.collection.Iterator<U>> func,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> evidence$7)

(Scala-specific) Returns a new Dataset that contains the result of applying func to each partition.

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>

[mapPartitions](../../../../org/apache/spark/sql/Dataset.html#mapPartitions-org.apache.spark.api.java.function.MapPartitionsFunction-org.apache.spark.sql.Encoder-)([MapPartitionsFunction](../../../../org/apache/spark/api/java/function/MapPartitionsFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U> f,[Encoder](../../../../org/apache/spark/sql/Encoder.html "interface in org.apache.spark.sql")<U> encoder)

(Java-specific) Returns a new Dataset that contains the result of applying f to each partition.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[melt](../../../../org/apache/spark/sql/Dataset.html#melt-org.apache.spark.sql.Column:A-org.apache.spark.sql.Column:A-java.lang.String-java.lang.String-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] ids,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] values, String variableColumnName, String valueColumnName)

Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[melt](../../../../org/apache/spark/sql/Dataset.html#melt-org.apache.spark.sql.Column:A-java.lang.String-java.lang.String-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] ids, String variableColumnName, String valueColumnName)

Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.
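As a rough illustration of the wide-to-long reshaping that melt describes, the following plain-Java sketch models rows as maps. It does not use Spark; the class name, the map-based row model, and the sample column names (id, q1, q2, quarter, sales) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: models unpivoting with Java maps, not with Spark.
final class MeltSketch {
    // Emits one long-format row per (input row, value column) pair.
    static List<Map<String, Object>> melt(List<Map<String, Object>> rows,
                                          List<String> ids,
                                          List<String> values,
                                          String variableColumnName,
                                          String valueColumnName) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            for (String valueColumn : values) {
                Map<String, Object> longRow = new LinkedHashMap<>();
                for (String id : ids) {
                    longRow.put(id, row.get(id)); // identifier columns are copied unchanged
                }
                longRow.put(variableColumnName, valueColumn);       // name of the source column
                longRow.put(valueColumnName, row.get(valueColumn)); // value of the source column
                out.add(longRow);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> wide = new LinkedHashMap<>();
        wide.put("id", 1);
        wide.put("q1", 10);
        wide.put("q2", 20);
        List<Map<String, Object>> result = melt(
                Arrays.asList(wide), Arrays.asList("id"), Arrays.asList("q1", "q2"),
                "quarter", "sales");
        System.out.println(result);
        // [{id=1, quarter=q1, sales=10}, {id=1, quarter=q2, sales=20}]
    }
}
```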

[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")

[metadataColumn](../../../../org/apache/spark/sql/Dataset.html#metadataColumn-java.lang.String-)(String colName)

Selects a metadata column based on its logical column name, and returns it as a Column.

[DataFrameNaFunctions](../../../../org/apache/spark/sql/DataFrameNaFunctions.html "class in org.apache.spark.sql")

[na](../../../../org/apache/spark/sql/Dataset.html#na--)()

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[observe](../../../../org/apache/spark/sql/Dataset.html#observe-org.apache.spark.sql.Observation-org.apache.spark.sql.Column-org.apache.spark.sql.Column...-)([Observation](../../../../org/apache/spark/sql/Observation.html "class in org.apache.spark.sql") observation,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... exprs)

Observe (named) metrics through an org.apache.spark.sql.Observation instance.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[observe](../../../../org/apache/spark/sql/Dataset.html#observe-org.apache.spark.sql.Observation-org.apache.spark.sql.Column-scala.collection.Seq-)([Observation](../../../../org/apache/spark/sql/Observation.html "class in org.apache.spark.sql") observation,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> exprs)

Observe (named) metrics through an org.apache.spark.sql.Observation instance.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[observe](../../../../org/apache/spark/sql/Dataset.html#observe-java.lang.String-org.apache.spark.sql.Column-org.apache.spark.sql.Column...-)(String name,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... exprs)

Define (named) metrics to observe on the Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[observe](../../../../org/apache/spark/sql/Dataset.html#observe-java.lang.String-org.apache.spark.sql.Column-scala.collection.Seq-)(String name,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") expr, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> exprs)

Define (named) metrics to observe on the Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[offset](../../../../org/apache/spark/sql/Dataset.html#offset-int-)(int n)

Returns a new Dataset by skipping the first n rows.

static [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[ofRows](../../../../org/apache/spark/sql/Dataset.html#ofRows-org.apache.spark.sql.SparkSession-org.apache.spark.sql.catalyst.plans.logical.LogicalPlan-)([SparkSession](../../../../org/apache/spark/sql/SparkSession.html "class in org.apache.spark.sql") sparkSession, org.apache.spark.sql.catalyst.plans.logical.LogicalPlan logicalPlan)

static [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[ofRows](../../../../org/apache/spark/sql/Dataset.html#ofRows-org.apache.spark.sql.SparkSession-org.apache.spark.sql.catalyst.plans.logical.LogicalPlan-org.apache.spark.sql.catalyst.QueryPlanningTracker-)([SparkSession](../../../../org/apache/spark/sql/SparkSession.html "class in org.apache.spark.sql") sparkSession, org.apache.spark.sql.catalyst.plans.logical.LogicalPlan logicalPlan, org.apache.spark.sql.catalyst.QueryPlanningTracker tracker)

A variant of ofRows that allows passing in a tracker so we can track query parsing time.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[orderBy](../../../../org/apache/spark/sql/Dataset.html#orderBy-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... sortExprs)

Returns a new Dataset sorted by the given expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[orderBy](../../../../org/apache/spark/sql/Dataset.html#orderBy-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> sortExprs)

Returns a new Dataset sorted by the given expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[orderBy](../../../../org/apache/spark/sql/Dataset.html#orderBy-java.lang.String-scala.collection.Seq-)(String sortCol, scala.collection.Seq<String> sortCols)

Returns a new Dataset sorted by the given expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[orderBy](../../../../org/apache/spark/sql/Dataset.html#orderBy-java.lang.String-java.lang.String...-)(String sortCol, String... sortCols)

Returns a new Dataset sorted by the given expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[persist](../../../../org/apache/spark/sql/Dataset.html#persist--)()

Persist this Dataset with the default storage level (MEMORY_AND_DISK).

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[persist](../../../../org/apache/spark/sql/Dataset.html#persist-org.apache.spark.storage.StorageLevel-)([StorageLevel](../../../../org/apache/spark/storage/StorageLevel.html "class in org.apache.spark.storage") newLevel)

Persist this Dataset with the given storage level.

void

[printSchema](../../../../org/apache/spark/sql/Dataset.html#printSchema--)()

Prints the schema to the console in a nice tree format.

void

[printSchema](../../../../org/apache/spark/sql/Dataset.html#printSchema-int-)(int level)

Prints the schema up to the given level to the console in a nice tree format.

org.apache.spark.sql.execution.QueryExecution

[queryExecution](../../../../org/apache/spark/sql/Dataset.html#queryExecution--)()

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>[]

[randomSplit](../../../../org/apache/spark/sql/Dataset.html#randomSplit-double:A-)(double[] weights)

Randomly splits this Dataset with the provided weights.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>[]

[randomSplit](../../../../org/apache/spark/sql/Dataset.html#randomSplit-double:A-long-)(double[] weights, long seed)

Randomly splits this Dataset with the provided weights.

java.util.List<[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>>

[randomSplitAsList](../../../../org/apache/spark/sql/Dataset.html#randomSplitAsList-double:A-long-)(double[] weights, long seed)

Returns a Java list containing the Datasets produced by randomly splitting this Dataset with the provided weights.
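The weights passed to randomSplit are normalized if they do not sum to 1. The plain-Java sketch below shows one way such normalization can be turned into cumulative boundaries, where split i would cover the range from boundary i up to boundary i + 1. It is an illustration under that assumption, not Spark's code; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper: illustrates weight normalization only, not Spark's implementation.
final class RandomSplitWeightsSketch {
    // Normalizes the weights and returns cumulative boundaries starting at 0.0.
    static double[] cumulativeBoundaries(double[] weights) {
        double sum = 0.0;
        for (double w : weights) {
            if (w < 0.0) {
                throw new IllegalArgumentException("Weights must be non-negative");
            }
            sum += w;
        }
        if (sum <= 0.0) {
            throw new IllegalArgumentException("Sum of weights must be positive");
        }
        double[] bounds = new double[weights.length + 1];
        for (int i = 0; i < weights.length; i++) {
            bounds[i + 1] = bounds[i] + weights[i] / sum;
        }
        return bounds;
    }

    public static void main(String[] args) {
        // Weights 3 and 1 normalize to 0.75 and 0.25.
        System.out.println(Arrays.toString(cumulativeBoundaries(new double[] {3.0, 1.0})));
        // [0.0, 0.75, 1.0]
    }
}
```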

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[rdd](../../../../org/apache/spark/sql/Dataset.html#rdd--)()

[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")

[reduce](../../../../org/apache/spark/sql/Dataset.html#reduce-scala.Function2-)(scala.Function2<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)

(Scala-specific) Reduces the elements of this Dataset using the specified binary function.

[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")

[reduce](../../../../org/apache/spark/sql/Dataset.html#reduce-org.apache.spark.api.java.function.ReduceFunction-)([ReduceFunction](../../../../org/apache/spark/api/java/function/ReduceFunction.html "interface in org.apache.spark.api.java.function")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> func)

(Java-specific) Reduces the elements of this Dataset using the specified binary function.

void

[registerTempTable](../../../../org/apache/spark/sql/Dataset.html#registerTempTable-java.lang.String-)(String tableName)

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... partitionExprs)

Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-int-)(int numPartitions)

Returns a new Dataset that has exactly numPartitions partitions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-int-org.apache.spark.sql.Column...-)(int numPartitions,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... partitionExprs)

Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-int-scala.collection.Seq-)(int numPartitions, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> partitionExprs)

Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[repartition](../../../../org/apache/spark/sql/Dataset.html#repartition-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> partitionExprs)

Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[repartitionByRange](../../../../org/apache/spark/sql/Dataset.html#repartitionByRange-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... partitionExprs)

Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[repartitionByRange](../../../../org/apache/spark/sql/Dataset.html#repartitionByRange-int-org.apache.spark.sql.Column...-)(int numPartitions,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... partitionExprs)

Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[repartitionByRange](../../../../org/apache/spark/sql/Dataset.html#repartitionByRange-int-scala.collection.Seq-)(int numPartitions, scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> partitionExprs)

Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[repartitionByRange](../../../../org/apache/spark/sql/Dataset.html#repartitionByRange-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> partitionExprs)

Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.
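The repartition overloads that take expressions produce a hash-partitioned Dataset, while the repartitionByRange overloads produce a range-partitioned one. The plain-Java sketch below contrasts the two assignment strategies. It is a simplified model: it uses Java's hashCode as a stand-in for Spark's hash function and takes the range boundaries as a given array, whereas Spark estimates boundaries by sampling. All names are hypothetical.

```java
// Hypothetical helper: contrasts hash and range assignment, not Spark's implementation.
final class PartitioningSketch {
    // Hash partitioning: equal keys always land in the same partition.
    static int hashPartition(Object key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Range partitioning: upperBounds is sorted ascending and has numPartitions - 1 entries;
    // partition p holds keys up to and including upperBounds[p].
    static int rangePartition(int key, int[] upperBounds) {
        int p = 0;
        while (p < upperBounds.length && key > upperBounds[p]) {
            p++;
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(hashPartition(7, 4));                    // 3
        System.out.println(rangePartition(15, new int[] {10, 20})); // 1
    }
}
```

With range assignment, neighbouring keys tend to share a partition, which is why it suits sorted output; hash assignment spreads keys without regard to order.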

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[rollup](../../../../org/apache/spark/sql/Dataset.html#rollup-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)

Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[rollup](../../../../org/apache/spark/sql/Dataset.html#rollup-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)

Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[rollup](../../../../org/apache/spark/sql/Dataset.html#rollup-java.lang.String-scala.collection.Seq-)(String col1, scala.collection.Seq<String> cols)

Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.

[RelationalGroupedDataset](../../../../org/apache/spark/sql/RelationalGroupedDataset.html "class in org.apache.spark.sql")

[rollup](../../../../org/apache/spark/sql/Dataset.html#rollup-java.lang.String-java.lang.String...-)(String col1, String... cols)

Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.
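A rollup over columns (a, b) aggregates over the grouping sets (a, b), (a), and the empty set (the grand total), following SQL ROLLUP semantics. The plain-Java sketch below only enumerates those grouping sets for a list of column names; it does not call Spark, and the class name and sample column names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: enumerates ROLLUP grouping sets, not Spark's implementation.
final class RollupSketch {
    // Returns every prefix of cols, from the full list down to the empty list.
    static List<List<String>> groupingSets(List<String> cols) {
        List<List<String>> sets = new ArrayList<>();
        for (int n = cols.size(); n >= 0; n--) {
            sets.add(new ArrayList<>(cols.subList(0, n)));
        }
        return sets;
    }

    public static void main(String[] args) {
        System.out.println(groupingSets(Arrays.asList("department", "group")));
        // [[department, group], [department], []]
    }
}
```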

boolean

[sameSemantics](../../../../org/apache/spark/sql/Dataset.html#sameSemantics-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)

Returns true when the logical query plans inside both Datasets are equal and therefore return the same results.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sample](../../../../org/apache/spark/sql/Dataset.html#sample-boolean-double-)(boolean withReplacement, double fraction)

Returns a new Dataset by sampling a fraction of rows, using a random seed.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sample](../../../../org/apache/spark/sql/Dataset.html#sample-boolean-double-long-)(boolean withReplacement, double fraction, long seed)

Returns a new Dataset by sampling a fraction of rows, using a user-supplied seed.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sample](../../../../org/apache/spark/sql/Dataset.html#sample-double-)(double fraction)

Returns a new Dataset by sampling a fraction of rows (without replacement), using a random seed.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sample](../../../../org/apache/spark/sql/Dataset.html#sample-double-long-)(double fraction, long seed)

Returns a new Dataset by sampling a fraction of rows (without replacement), using a user-supplied seed.
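Sampling without replacement can be pictured as an independent coin flip per row, which is why the result is not guaranteed to contain exactly the requested fraction of rows, and why a fixed seed makes the outcome repeatable. The plain-Java sketch below models this with java.util.Random as a stand-in for Spark's own random number generator; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical helper: models per-row Bernoulli sampling, not Spark's implementation.
final class SampleSketch {
    // Keeps each row independently with probability `fraction`.
    static <T> List<T> sample(List<T> rows, double fraction, long seed) {
        Random rng = new Random(seed);
        List<T> out = new ArrayList<>();
        for (T row : rows) {
            if (rng.nextDouble() < fraction) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        // The same seed yields the same subset; its size is only approximately half.
        System.out.println(sample(rows, 0.5, 42L));
        System.out.println(sample(rows, 0.5, 42L));
    }
}
```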

[StructType](../../../../org/apache/spark/sql/types/StructType.html "class in org.apache.spark.sql.types")

[schema](../../../../org/apache/spark/sql/Dataset.html#schema--)()

Returns the schema of this Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... cols)

Selects a set of column-based expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[select](../../../../org/apache/spark/sql/Dataset.html#select-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> cols)

Selects a set of column-based expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[select](../../../../org/apache/spark/sql/Dataset.html#select-java.lang.String-scala.collection.Seq-)(String col, scala.collection.Seq<String> cols)

Selects a set of columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[select](../../../../org/apache/spark/sql/Dataset.html#select-java.lang.String-java.lang.String...-)(String col, String... cols)

Selects a set of columns.

<U1> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U1>

[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1)

Returns a new Dataset by computing the given Column expression for each element.

<U1,U2> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple2<U1,U2>>

[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U2> c2)

Returns a new Dataset by computing the given Column expressions for each element.

<U1,U2,U3> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple3<U1,U2,U3>>

[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U2> c2,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U3> c3)

Returns a new Dataset by computing the given Column expressions for each element.

<U1,U2,U3,U4> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple4<U1,U2,U3,U4>>

[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U2> c2,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U3> c3,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U4> c4)

Returns a new Dataset by computing the given Column expressions for each element.

<U1,U2,U3,U4,U5> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<scala.Tuple5<U1,U2,U3,U4,U5>>

[select](../../../../org/apache/spark/sql/Dataset.html#select-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-org.apache.spark.sql.TypedColumn-)([TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U1> c1,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U2> c2,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U3> c3,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U4> c4,[TypedColumn](../../../../org/apache/spark/sql/TypedColumn.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset"),U5> c5)

Returns a new Dataset by computing the given Column expressions for each element.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[selectExpr](../../../../org/apache/spark/sql/Dataset.html#selectExpr-scala.collection.Seq-)(scala.collection.Seq<String> exprs)

Selects a set of SQL expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[selectExpr](../../../../org/apache/spark/sql/Dataset.html#selectExpr-java.lang.String...-)(String... exprs)

Selects a set of SQL expressions.

int

[semanticHash](../../../../org/apache/spark/sql/Dataset.html#semanticHash--)()

Returns a hashCode of the logical query plan against this Dataset.

void

[show](../../../../org/apache/spark/sql/Dataset.html#show--)()

Displays the top 20 rows of the Dataset in a tabular form.

void

[show](../../../../org/apache/spark/sql/Dataset.html#show-boolean-)(boolean truncate)

Displays the top 20 rows of the Dataset in a tabular form.

void

[show](../../../../org/apache/spark/sql/Dataset.html#show-int-)(int numRows)

Displays the Dataset in a tabular form.

void

[show](../../../../org/apache/spark/sql/Dataset.html#show-int-boolean-)(int numRows, boolean truncate)

Displays the Dataset in a tabular form.

void

[show](../../../../org/apache/spark/sql/Dataset.html#show-int-int-)(int numRows, int truncate)

Displays the Dataset in a tabular form.

void

[show](../../../../org/apache/spark/sql/Dataset.html#show-int-int-boolean-)(int numRows, int truncate, boolean vertical)

Displays the Dataset in a tabular form.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sort](../../../../org/apache/spark/sql/Dataset.html#sort-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... sortExprs)

Returns a new Dataset sorted by the given expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sort](../../../../org/apache/spark/sql/Dataset.html#sort-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> sortExprs)

Returns a new Dataset sorted by the given expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sort](../../../../org/apache/spark/sql/Dataset.html#sort-java.lang.String-scala.collection.Seq-)(String sortCol, scala.collection.Seq<String> sortCols)

Returns a new Dataset sorted by the specified columns, all in ascending order.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sort](../../../../org/apache/spark/sql/Dataset.html#sort-java.lang.String-java.lang.String...-)(String sortCol, String... sortCols)

Returns a new Dataset sorted by the specified columns, all in ascending order.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sortWithinPartitions](../../../../org/apache/spark/sql/Dataset.html#sortWithinPartitions-org.apache.spark.sql.Column...-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")... sortExprs)

Returns a new Dataset with each partition sorted by the given expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sortWithinPartitions](../../../../org/apache/spark/sql/Dataset.html#sortWithinPartitions-scala.collection.Seq-)(scala.collection.Seq<[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> sortExprs)

Returns a new Dataset with each partition sorted by the given expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sortWithinPartitions](../../../../org/apache/spark/sql/Dataset.html#sortWithinPartitions-java.lang.String-scala.collection.Seq-)(String sortCol, scala.collection.Seq<String> sortCols)

Returns a new Dataset with each partition sorted by the given expressions.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[sortWithinPartitions](../../../../org/apache/spark/sql/Dataset.html#sortWithinPartitions-java.lang.String-java.lang.String...-)(String sortCol, String... sortCols)

Returns a new Dataset with each partition sorted by the given expressions.

[SparkSession](../../../../org/apache/spark/sql/SparkSession.html "class in org.apache.spark.sql")

[sparkSession](../../../../org/apache/spark/sql/Dataset.html#sparkSession--)()

[SQLContext](../../../../org/apache/spark/sql/SQLContext.html "class in org.apache.spark.sql")

[sqlContext](../../../../org/apache/spark/sql/Dataset.html#sqlContext--)()

[DataFrameStatFunctions](../../../../org/apache/spark/sql/DataFrameStatFunctions.html "class in org.apache.spark.sql")

[stat](../../../../org/apache/spark/sql/Dataset.html#stat--)()

[StorageLevel](../../../../org/apache/spark/storage/StorageLevel.html "class in org.apache.spark.storage")

[storageLevel](../../../../org/apache/spark/sql/Dataset.html#storageLevel--)()

Get the Dataset's current storage level, or StorageLevel.NONE if not persisted.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[summary](../../../../org/apache/spark/sql/Dataset.html#summary-scala.collection.Seq-)(scala.collection.Seq<String> statistics)

Computes specified statistics for numeric and string columns.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[summary](../../../../org/apache/spark/sql/Dataset.html#summary-java.lang.String...-)(String... statistics)

Computes specified statistics for numeric and string columns.

Object

[tail](../../../../org/apache/spark/sql/Dataset.html#tail-int-)(int n)

Returns the last n rows in the Dataset.

Object

[take](../../../../org/apache/spark/sql/Dataset.html#take-int-)(int n)

Returns the first n rows in the Dataset.

java.util.List<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[takeAsList](../../../../org/apache/spark/sql/Dataset.html#takeAsList-int-)(int n)

Returns the first n rows in the Dataset as a list.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[to](../../../../org/apache/spark/sql/Dataset.html#to-org.apache.spark.sql.types.StructType-)([StructType](../../../../org/apache/spark/sql/types/StructType.html "class in org.apache.spark.sql.types") schema)

Returns a new DataFrame where each row is reconciled to match the specified schema.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[toDF](../../../../org/apache/spark/sql/Dataset.html#toDF--)()

Converts this strongly typed collection of data to a generic DataFrame.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[toDF](../../../../org/apache/spark/sql/Dataset.html#toDF-scala.collection.Seq-)(scala.collection.Seq<String> colNames)

Converts this strongly typed collection of data to a generic DataFrame with columns renamed.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[toDF](../../../../org/apache/spark/sql/Dataset.html#toDF-java.lang.String...-)(String... colNames)

Converts this strongly typed collection of data to a generic DataFrame with columns renamed.

[JavaRDD](../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[toJavaRDD](../../../../org/apache/spark/sql/Dataset.html#toJavaRDD--)()

Returns the content of the Dataset as a JavaRDD of Ts.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<String>

[toJSON](../../../../org/apache/spark/sql/Dataset.html#toJSON--)()

Returns the content of the Dataset as a Dataset of JSON strings.

java.util.Iterator<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[toLocalIterator](../../../../org/apache/spark/sql/Dataset.html#toLocalIterator--)()

Returns an iterator that contains all rows in this Dataset.

String

[toString](../../../../org/apache/spark/sql/Dataset.html#toString--)()

<U> [Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>

[transform](../../../../org/apache/spark/sql/Dataset.html#transform-scala.Function1-)(scala.Function1<[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>,[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<U>> t)

Concise syntax for chaining custom transformations.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[union](../../../../org/apache/spark/sql/Dataset.html#union-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)

Returns a new Dataset containing the union of rows in this Dataset and another Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[unionAll](../../../../org/apache/spark/sql/Dataset.html#unionAll-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)

Returns a new Dataset containing the union of rows in this Dataset and another Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[unionByName](../../../../org/apache/spark/sql/Dataset.html#unionByName-org.apache.spark.sql.Dataset-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other)

Returns a new Dataset containing the union of rows in this Dataset and another Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[unionByName](../../../../org/apache/spark/sql/Dataset.html#unionByName-org.apache.spark.sql.Dataset-boolean-)([Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")> other, boolean allowMissingColumns)

Returns a new Dataset containing the union of rows in this Dataset and another Dataset.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[unpersist](../../../../org/apache/spark/sql/Dataset.html#unpersist--)()

Mark the Dataset as non-persistent, and remove all blocks for it from memory and disk.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[unpersist](../../../../org/apache/spark/sql/Dataset.html#unpersist-boolean-)(boolean blocking)

Mark the Dataset as non-persistent, and remove all blocks for it from memory and disk.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[unpivot](../../../../org/apache/spark/sql/Dataset.html#unpivot-org.apache.spark.sql.Column:A-org.apache.spark.sql.Column:A-java.lang.String-java.lang.String-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] ids,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] values, String variableColumnName, String valueColumnName)

Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[unpivot](../../../../org/apache/spark/sql/Dataset.html#unpivot-org.apache.spark.sql.Column:A-java.lang.String-java.lang.String-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")[] ids, String variableColumnName, String valueColumnName)

Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[where](../../../../org/apache/spark/sql/Dataset.html#where-org.apache.spark.sql.Column-)([Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") condition)

Filters rows using the given condition.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[where](../../../../org/apache/spark/sql/Dataset.html#where-java.lang.String-)(String conditionExpr)

Filters rows using the given SQL expression.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[withColumn](../../../../org/apache/spark/sql/Dataset.html#withColumn-java.lang.String-org.apache.spark.sql.Column-)(String colName,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql") col)

Returns a new Dataset by adding a column or replacing the existing column that has the same name.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[withColumnRenamed](../../../../org/apache/spark/sql/Dataset.html#withColumnRenamed-java.lang.String-java.lang.String-)(String existingName, String newName)

Returns a new Dataset with a column renamed.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[withColumns](../../../../org/apache/spark/sql/Dataset.html#withColumns-scala.collection.immutable.Map-)(scala.collection.immutable.Map<String,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> colsMap)

(Scala-specific) Returns a new Dataset by adding columns or replacing the existing columns that have the same names.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[withColumns](../../../../org/apache/spark/sql/Dataset.html#withColumns-java.util.Map-)(java.util.Map<String,[Column](../../../../org/apache/spark/sql/Column.html "class in org.apache.spark.sql")> colsMap)

(Java-specific) Returns a new Dataset by adding columns or replacing the existing columns that have the same names.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[withColumnsRenamed](../../../../org/apache/spark/sql/Dataset.html#withColumnsRenamed-scala.collection.immutable.Map-)(scala.collection.immutable.Map<String,String> colsMap)

(Scala-specific) Returns a new Dataset with columns renamed.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[withColumnsRenamed](../../../../org/apache/spark/sql/Dataset.html#withColumnsRenamed-java.util.Map-)(java.util.Map<String,String> colsMap)

(Java-specific) Returns a new Dataset with columns renamed.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[Row](../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")>

[withMetadata](../../../../org/apache/spark/sql/Dataset.html#withMetadata-java.lang.String-org.apache.spark.sql.types.Metadata-)(String columnName,[Metadata](../../../../org/apache/spark/sql/types/Metadata.html "class in org.apache.spark.sql.types") metadata)

Returns a new Dataset by updating an existing column with metadata.

[Dataset](../../../../org/apache/spark/sql/Dataset.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[withWatermark](../../../../org/apache/spark/sql/Dataset.html#withWatermark-java.lang.String-java.lang.String-)(String eventTime, String delayThreshold)

Defines an event time watermark for this Dataset.

[DataFrameWriter](../../../../org/apache/spark/sql/DataFrameWriter.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[write](../../../../org/apache/spark/sql/Dataset.html#write--)()

Interface for saving the content of the non-streaming Dataset out into external storage.

[DataStreamWriter](../../../../org/apache/spark/sql/streaming/DataStreamWriter.html "class in org.apache.spark.sql.streaming")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[writeStream](../../../../org/apache/spark/sql/Dataset.html#writeStream--)()

Interface for saving the content of the streaming Dataset out into external storage.

[DataFrameWriterV2](../../../../org/apache/spark/sql/DataFrameWriterV2.html "class in org.apache.spark.sql")<[T](../../../../org/apache/spark/sql/Dataset.html "type parameter in Dataset")>

[writeTo](../../../../org/apache/spark/sql/Dataset.html#writeTo-java.lang.String-)(String table)

Create a write configuration builder for v2 sources.