RDD (Spark 3.5.5 JavaDoc)

Modifier and Type

Method and Description

<U> U

[aggregate](../../../../org/apache/spark/rdd/RDD.html#aggregate-U-scala.Function2-scala.Function2-scala.reflect.ClassTag-)(U zeroValue, scala.Function2<U,[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> seqOp, scala.Function2<U,U,U> combOp, scala.reflect.ClassTag<U> evidence$33)

Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
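
As a minimal sketch (assuming a live `SparkContext` in scope as `sc`, as in `spark-shell`), `aggregate` can compute a sum and a count in a single pass:

```scala
val nums = sc.parallelize(1 to 100)
val (sum, count) = nums.aggregate((0, 0))(
  (acc, x) => (acc._1 + x, acc._2 + 1),   // seqOp: fold one element into a partition's accumulator
  (a, b)   => (a._1 + b._1, a._2 + b._2)  // combOp: merge accumulators across partitions
)
val mean = sum.toDouble / count           // 50.5
```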

[RDDBarrier](../../../../org/apache/spark/rdd/RDDBarrier.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[barrier](../../../../org/apache/spark/rdd/RDD.html#barrier--)()

:: Experimental :: Marks the current stage as a barrier stage, where Spark must launch all tasks together.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[cache](../../../../org/apache/spark/rdd/RDD.html#cache--)()

Persist this RDD with the default storage level (MEMORY_ONLY).
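
For example (hypothetical input path, `sc` in scope as above):

```scala
val lines  = sc.textFile("hdfs:///data/input.txt")   // hypothetical path
val cached = lines.cache()                 // same as persist(StorageLevel.MEMORY_ONLY)
cached.count()                             // first action computes and caches the partitions
cached.filter(_.nonEmpty).count()          // subsequent actions read from the cache
```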

<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U>>

[cartesian](../../../../org/apache/spark/rdd/RDD.html#cartesian-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U> other, scala.reflect.ClassTag<U> evidence$5)

Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
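
A small sketch; note the result has |this| × |other| elements, so it grows quickly:

```scala
val a = sc.parallelize(Seq(1, 2))
val b = sc.parallelize(Seq("x", "y"))
a.cartesian(b).collect()
// Array((1,x), (1,y), (2,x), (2,y)) -- element order may vary with partitioning
```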

void

[checkpoint](../../../../org/apache/spark/rdd/RDD.html#checkpoint--)()

Mark this RDD for checkpointing.

void

[cleanShuffleDependencies](../../../../org/apache/spark/rdd/RDD.html#cleanShuffleDependencies-boolean-)(boolean blocking)

Removes an RDD's shuffles and its non-persisted ancestors.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[coalesce](../../../../org/apache/spark/rdd/RDD.html#coalesce-int-boolean-scala.Option-scala.math.Ordering-)(int numPartitions, boolean shuffle, scala.Option<[PartitionCoalescer](../../../../org/apache/spark/rdd/PartitionCoalescer.html "interface in org.apache.spark.rdd")> partitionCoalescer, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Return a new RDD that is reduced into numPartitions partitions.
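
A sketch distinguishing the narrow (no-shuffle) case from the shuffling one:

```scala
val wide = sc.parallelize(1 to 1000, 100)
// shuffle = false (the default): merges partitions without a shuffle;
// cheap, but can leave the resulting partitions skewed.
val narrow = wide.coalesce(10)
// shuffle = true: redistributes data evenly, equivalent to repartition(10).
val balanced = wide.coalesce(10, shuffle = true)
```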

Object

[collect](../../../../org/apache/spark/rdd/RDD.html#collect--)()

Return an array that contains all of the elements in this RDD.

<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>

[collect](../../../../org/apache/spark/rdd/RDD.html#collect-scala.PartialFunction-scala.reflect.ClassTag-)(scala.PartialFunction<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> f, scala.reflect.ClassTag<U> evidence$32)

Return an RDD that contains all matching values by applying f.

abstract scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[compute](../../../../org/apache/spark/rdd/RDD.html#compute-org.apache.spark.Partition-org.apache.spark.TaskContext-)([Partition](../../../../org/apache/spark/Partition.html "interface in org.apache.spark") split,[TaskContext](../../../../org/apache/spark/TaskContext.html "class in org.apache.spark") context)

:: DeveloperApi :: Implemented by subclasses to compute a given partition.

[SparkContext](../../../../org/apache/spark/SparkContext.html "class in org.apache.spark")

[context](../../../../org/apache/spark/rdd/RDD.html#context--)()

long

[count](../../../../org/apache/spark/rdd/RDD.html#count--)()

Return the number of elements in the RDD.

[PartialResult](../../../../org/apache/spark/partial/PartialResult.html "class in org.apache.spark.partial")<[BoundedDouble](../../../../org/apache/spark/partial/BoundedDouble.html "class in org.apache.spark.partial")>

[countApprox](../../../../org/apache/spark/rdd/RDD.html#countApprox-long-double-)(long timeout, double confidence)

Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.

long

[countApproxDistinct](../../../../org/apache/spark/rdd/RDD.html#countApproxDistinct-double-)(double relativeSD)

Return approximate number of distinct elements in the RDD.
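
A sketch; `relativeSD` trades accuracy for the size of the underlying HyperLogLog++ sketch:

```scala
val ids = sc.parallelize(1 to 100000).map(_ % 1000)
val approx = ids.countApproxDistinct(relativeSD = 0.05)
// approx lands near the true 1000, within roughly 5% relative error
```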

long

[countApproxDistinct](../../../../org/apache/spark/rdd/RDD.html#countApproxDistinct-int-int-)(int p, int sp)

Return approximate number of distinct elements in the RDD.

scala.collection.Map<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),Object>

[countByValue](../../../../org/apache/spark/rdd/RDD.html#countByValue-scala.math.Ordering-)(scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Return the count of each unique value in this RDD as a local map of (value, count) pairs.

[PartialResult](../../../../org/apache/spark/partial/PartialResult.html "class in org.apache.spark.partial")<scala.collection.Map<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[BoundedDouble](../../../../org/apache/spark/partial/BoundedDouble.html "class in org.apache.spark.partial")>>

[countByValueApprox](../../../../org/apache/spark/rdd/RDD.html#countByValueApprox-long-double-scala.math.Ordering-)(long timeout, double confidence, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Approximate version of countByValue().

scala.collection.Seq<[Dependency](../../../../org/apache/spark/Dependency.html "class in org.apache.spark")<?>>

[dependencies](../../../../org/apache/spark/rdd/RDD.html#dependencies--)()

Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[distinct](../../../../org/apache/spark/rdd/RDD.html#distinct--)()

Return a new RDD containing the distinct elements in this RDD.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[distinct](../../../../org/apache/spark/rdd/RDD.html#distinct-int-scala.math.Ordering-)(int numPartitions, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Return a new RDD containing the distinct elements in this RDD.

static [DoubleRDDFunctions](../../../../org/apache/spark/rdd/DoubleRDDFunctions.html "class in org.apache.spark.rdd")

[doubleRDDToDoubleRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#doubleRDDToDoubleRDDFunctions-org.apache.spark.rdd.RDD-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<Object> rdd)

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[filter](../../../../org/apache/spark/rdd/RDD.html#filter-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),Object> f)

Return a new RDD containing only the elements that satisfy a predicate.

[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")

[first](../../../../org/apache/spark/rdd/RDD.html#first--)()

Return the first element in this RDD.

<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>

[flatMap](../../../../org/apache/spark/rdd/RDD.html#flatMap-scala.Function1-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),scala.collection.TraversableOnce<U>> f, scala.reflect.ClassTag<U> evidence$4)

Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
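
The classic word-splitting sketch:

```scala
val lines = sc.parallelize(Seq("to be", "or not to be"))
val words = lines.flatMap(_.split(" "))   // each line yields zero or more words
words.collect()                           // Array(to, be, or, not, to, be)
```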

[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")

[fold](../../../../org/apache/spark/rdd/RDD.html#fold-T-scala.Function2-)([T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD") zeroValue, scala.Function2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> op)

Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
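
A sketch, with the zero-value subtlety worth knowing:

```scala
val nums = sc.parallelize(Seq(1, 2, 3, 4))
nums.fold(0)(_ + _)   // 10
// The zero value is folded in once per partition and once in the final merge,
// so it must be a neutral element for op (0 for +, 1 for *) or results skew.
```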

void

[foreach](../../../../org/apache/spark/rdd/RDD.html#foreach-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),scala.runtime.BoxedUnit> f)

Applies a function f to all elements of this RDD.

void

[foreachPartition](../../../../org/apache/spark/rdd/RDD.html#foreachPartition-scala.Function1-)(scala.Function1<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.runtime.BoxedUnit> f)

Applies a function f to each partition of this RDD.

scala.Option<String>

[getCheckpointFile](../../../../org/apache/spark/rdd/RDD.html#getCheckpointFile--)()

Gets the name of the directory to which this RDD was checkpointed.

int

[getNumPartitions](../../../../org/apache/spark/rdd/RDD.html#getNumPartitions--)()

Returns the number of partitions of this RDD.

[ResourceProfile](../../../../org/apache/spark/resource/ResourceProfile.html "class in org.apache.spark.resource")

[getResourceProfile](../../../../org/apache/spark/rdd/RDD.html#getResourceProfile--)()

Get the ResourceProfile specified with this RDD or null if it wasn't specified.

[StorageLevel](../../../../org/apache/spark/storage/StorageLevel.html "class in org.apache.spark.storage")

[getStorageLevel](../../../../org/apache/spark/rdd/RDD.html#getStorageLevel--)()

Get the RDD's current storage level, or StorageLevel.NONE if none is set.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<Object>

[glom](../../../../org/apache/spark/rdd/RDD.html#glom--)()

Return an RDD created by coalescing all elements within each partition into an array.

<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,scala.collection.Iterable<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>>>

[groupBy](../../../../org/apache/spark/rdd/RDD.html#groupBy-scala.Function1-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f, scala.reflect.ClassTag<K> kt)

Return an RDD of grouped items.

<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,scala.collection.Iterable<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>>>

[groupBy](../../../../org/apache/spark/rdd/RDD.html#groupBy-scala.Function1-int-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f, int numPartitions, scala.reflect.ClassTag<K> kt)

Return an RDD of grouped elements.

<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,scala.collection.Iterable<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>>>

[groupBy](../../../../org/apache/spark/rdd/RDD.html#groupBy-scala.Function1-org.apache.spark.Partitioner-scala.reflect.ClassTag-scala.math.Ordering-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f,[Partitioner](../../../../org/apache/spark/Partitioner.html "class in org.apache.spark") p, scala.reflect.ClassTag<K> kt, scala.math.Ordering<K> ord)

Return an RDD of grouped items.
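
A sketch of the simplest overload, grouping integers by parity:

```scala
val nums = sc.parallelize(1 to 10)
nums.groupBy(n => n % 2).collect()
// Array((0, Iterable(2, 4, 6, 8, 10)), (1, Iterable(1, 3, 5, 7, 9)))
// For per-key aggregation, reduceByKey / aggregateByKey are usually cheaper:
// they combine map-side instead of shuffling entire groups.
```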

int

[id](../../../../org/apache/spark/rdd/RDD.html#id--)()

A unique ID for this RDD (within its SparkContext).

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[intersection](../../../../org/apache/spark/rdd/RDD.html#intersection-org.apache.spark.rdd.RDD-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other)

Return the intersection of this RDD and another one.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[intersection](../../../../org/apache/spark/rdd/RDD.html#intersection-org.apache.spark.rdd.RDD-int-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other, int numPartitions)

Return the intersection of this RDD and another one.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[intersection](../../../../org/apache/spark/rdd/RDD.html#intersection-org.apache.spark.rdd.RDD-org.apache.spark.Partitioner-scala.math.Ordering-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other,[Partitioner](../../../../org/apache/spark/Partitioner.html "class in org.apache.spark") partitioner, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Return the intersection of this RDD and another one.

boolean

[isCheckpointed](../../../../org/apache/spark/rdd/RDD.html#isCheckpointed--)()

Return whether this RDD is checkpointed and materialized, either reliably or locally.

boolean

[isEmpty](../../../../org/apache/spark/rdd/RDD.html#isEmpty--)()

scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[iterator](../../../../org/apache/spark/rdd/RDD.html#iterator-org.apache.spark.Partition-org.apache.spark.TaskContext-)([Partition](../../../../org/apache/spark/Partition.html "interface in org.apache.spark") split,[TaskContext](../../../../org/apache/spark/TaskContext.html "class in org.apache.spark") context)

Internal method to this RDD; will read from cache if applicable, or otherwise compute it.

<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>>

[keyBy](../../../../org/apache/spark/rdd/RDD.html#keyBy-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f)

Creates tuples of the elements in this RDD by applying f.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[localCheckpoint](../../../../org/apache/spark/rdd/RDD.html#localCheckpoint--)()

Mark this RDD for local checkpointing using Spark's existing caching layer.

<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>

[map](../../../../org/apache/spark/rdd/RDD.html#map-scala.Function1-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> f, scala.reflect.ClassTag<U> evidence$3)

Return a new RDD by applying a function to all elements of this RDD.

<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>

[mapPartitions](../../../../org/apache/spark/rdd/RDD.html#mapPartitions-scala.Function1-boolean-scala.reflect.ClassTag-)(scala.Function1<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$6)

Return a new RDD by applying a function to each partition of this RDD.

<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>

[mapPartitionsWithEvaluator](../../../../org/apache/spark/rdd/RDD.html#mapPartitionsWithEvaluator-org.apache.spark.PartitionEvaluatorFactory-scala.reflect.ClassTag-)([PartitionEvaluatorFactory](../../../../org/apache/spark/PartitionEvaluatorFactory.html "interface in org.apache.spark")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> evaluatorFactory, scala.reflect.ClassTag<U> evidence$10)

Return a new RDD by applying an evaluator to each partition of this RDD.

<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>

[mapPartitionsWithIndex](../../../../org/apache/spark/rdd/RDD.html#mapPartitionsWithIndex-scala.Function2-boolean-scala.reflect.ClassTag-)(scala.Function2<Object,scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$9)

Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
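
A sketch tagging each element with its partition index:

```scala
val rdd = sc.parallelize(1 to 8, numSlices = 4)
val tagged = rdd.mapPartitionsWithIndex { (idx, it) =>
  it.map(x => s"partition $idx: $x")   // the function runs once per partition
}
tagged.collect().foreach(println)
```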

[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")

[max](../../../../org/apache/spark/rdd/RDD.html#max-scala.math.Ordering-)(scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Returns the max of this RDD as defined by the implicit Ordering[T].

[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")

[min](../../../../org/apache/spark/rdd/RDD.html#min-scala.math.Ordering-)(scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Returns the min of this RDD as defined by the implicit Ordering[T].

String

[name](../../../../org/apache/spark/rdd/RDD.html#name--)()

A friendly name for this RDD.

static <T> [DoubleRDDFunctions](../../../../org/apache/spark/rdd/DoubleRDDFunctions.html "class in org.apache.spark.rdd")

[numericRDDToDoubleRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#numericRDDToDoubleRDDFunctions-org.apache.spark.rdd.RDD-scala.math.Numeric-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<T> rdd, scala.math.Numeric<T> num)

scala.Option<[Partitioner](../../../../org/apache/spark/Partitioner.html "class in org.apache.spark")>

[partitioner](../../../../org/apache/spark/rdd/RDD.html#partitioner--)()

Optionally overridden by subclasses to specify how they are partitioned.

[Partition](../../../../org/apache/spark/Partition.html "interface in org.apache.spark")[]

[partitions](../../../../org/apache/spark/rdd/RDD.html#partitions--)()

Get the array of partitions of this RDD, taking into account whether the RDD is checkpointed or not.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[persist](../../../../org/apache/spark/rdd/RDD.html#persist--)()

Persist this RDD with the default storage level (MEMORY_ONLY).

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[persist](../../../../org/apache/spark/rdd/RDD.html#persist-org.apache.spark.storage.StorageLevel-)([StorageLevel](../../../../org/apache/spark/storage/StorageLevel.html "class in org.apache.spark.storage") newLevel)

Set this RDD's storage level to persist its values across operations after the first time it is computed.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<String>

[pipe](../../../../org/apache/spark/rdd/RDD.html#pipe-scala.collection.Seq-scala.collection.Map-scala.Function1-scala.Function2-boolean-int-java.lang.String-)(scala.collection.Seq<String> command, scala.collection.Map<String,String> env, scala.Function1<scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printPipeContext, scala.Function2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printRDDElement, boolean separateWorkingDir, int bufferSize, String encoding)

Return an RDD created by piping elements to a forked external process.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<String>

[pipe](../../../../org/apache/spark/rdd/RDD.html#pipe-java.lang.String-)(String command)

Return an RDD created by piping elements to a forked external process.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<String>

[pipe](../../../../org/apache/spark/rdd/RDD.html#pipe-java.lang.String-scala.collection.Map-)(String command, scala.collection.Map<String,String> env)

Return an RDD created by piping elements to a forked external process.
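
A sketch of the simplest overload, assuming a Unix `sort` binary is available on every executor:

```scala
// Each partition's elements are written to the child process's stdin, one per
// line; each stdout line becomes an element of the result. So `sort` here
// sorts within each partition, not globally.
val nums  = sc.parallelize(Seq("3", "1", "2"))
val piped = nums.pipe("sort")
piped.collect()
```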

scala.collection.Seq<String>

[preferredLocations](../../../../org/apache/spark/rdd/RDD.html#preferredLocations-org.apache.spark.Partition-)([Partition](../../../../org/apache/spark/Partition.html "interface in org.apache.spark") split)

Get the preferred locations of a partition, taking into account whether the RDD is checkpointed.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>[]

[randomSplit](../../../../org/apache/spark/rdd/RDD.html#randomSplit-double:A-long-)(double[] weights, long seed)

Randomly splits this RDD with the provided weights.
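
A typical train/test split sketch:

```scala
val data = sc.parallelize(1 to 1000)
// Weights are normalized if they don't sum to 1; fixing the seed makes the
// split reproducible across runs.
val Array(train, test) = data.randomSplit(Array(0.8, 0.2), seed = 42L)
```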

static <T> [AsyncRDDActions](../../../../org/apache/spark/rdd/AsyncRDDActions.html "class in org.apache.spark.rdd")<T>

[rddToAsyncRDDActions](../../../../org/apache/spark/rdd/RDD.html#rddToAsyncRDDActions-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<T> rdd, scala.reflect.ClassTag<T> evidence$38)

static <K,V> [OrderedRDDFunctions](../../../../org/apache/spark/rdd/OrderedRDDFunctions.html "class in org.apache.spark.rdd")<K,V,scala.Tuple2<K,V>>

[rddToOrderedRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#rddToOrderedRDDFunctions-org.apache.spark.rdd.RDD-scala.math.Ordering-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,V>> rdd, scala.math.Ordering<K> evidence$39, scala.reflect.ClassTag<K> evidence$40, scala.reflect.ClassTag<V> evidence$41)

static <K,V> [PairRDDFunctions](../../../../org/apache/spark/rdd/PairRDDFunctions.html "class in org.apache.spark.rdd")<K,V>

[rddToPairRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#rddToPairRDDFunctions-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.math.Ordering-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,V>> rdd, scala.reflect.ClassTag<K> kt, scala.reflect.ClassTag<V> vt, scala.math.Ordering<K> ord)

static <K,V> [SequenceFileRDDFunctions](../../../../org/apache/spark/rdd/SequenceFileRDDFunctions.html "class in org.apache.spark.rdd")<K,V>

[rddToSequenceFileRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#rddToSequenceFileRDDFunctions-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-scala.reflect.ClassTag---)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,V>> rdd, scala.reflect.ClassTag<K> kt, scala.reflect.ClassTag<V> vt, <any> keyWritableFactory, <any> valueWritableFactory)

[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")

[reduce](../../../../org/apache/spark/rdd/RDD.html#reduce-scala.Function2-)(scala.Function2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> f)

Reduces the elements of this RDD using the specified commutative and associative binary operator.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[repartition](../../../../org/apache/spark/rdd/RDD.html#repartition-int-scala.math.Ordering-)(int numPartitions, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Return a new RDD that has exactly numPartitions partitions.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[sample](../../../../org/apache/spark/rdd/RDD.html#sample-boolean-double-long-)(boolean withReplacement, double fraction, long seed)

Return a sampled subset of this RDD.
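
A sketch; note that `fraction` is an expected size, not a guarantee:

```scala
val data = sc.parallelize(1 to 10000)
// Without replacement, each element is kept independently with probability ~0.01.
val sampled = data.sample(withReplacement = false, fraction = 0.01, seed = 7L)
sampled.count()   // close to 100, but rarely exactly 100
```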

void

[saveAsObjectFile](../../../../org/apache/spark/rdd/RDD.html#saveAsObjectFile-java.lang.String-)(String path)

Save this RDD as a SequenceFile of serialized objects.

void

[saveAsTextFile](../../../../org/apache/spark/rdd/RDD.html#saveAsTextFile-java.lang.String-)(String path)

Save this RDD as a text file, using string representations of elements.

void

[saveAsTextFile](../../../../org/apache/spark/rdd/RDD.html#saveAsTextFile-java.lang.String-java.lang.Class-)(String path, Class<? extends org.apache.hadoop.io.compress.CompressionCodec> codec)

Save this RDD as a compressed text file, using string representations of elements.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[setName](../../../../org/apache/spark/rdd/RDD.html#setName-java.lang.String-)(String _name)

Assign a name to this RDD.

<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[sortBy](../../../../org/apache/spark/rdd/RDD.html#sortBy-scala.Function1-boolean-int-scala.math.Ordering-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f, boolean ascending, int numPartitions, scala.math.Ordering<K> ord, scala.reflect.ClassTag<K> ctag)

Return this RDD sorted by the given key function.
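
A sketch with a hypothetical `Person` record type:

```scala
case class Person(name: String, age: Int)   // hypothetical record type
val people   = sc.parallelize(Seq(Person("ann", 30), Person("bo", 25)))
val youngest = people.sortBy(_.age)                      // ascending by default
val oldest   = people.sortBy(_.age, ascending = false)   // descending
```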

[SparkContext](../../../../org/apache/spark/SparkContext.html "class in org.apache.spark")

[sparkContext](../../../../org/apache/spark/rdd/RDD.html#sparkContext--)()

The SparkContext that created this RDD.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[subtract](../../../../org/apache/spark/rdd/RDD.html#subtract-org.apache.spark.rdd.RDD-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other)

Return an RDD with the elements from this that are not in other.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[subtract](../../../../org/apache/spark/rdd/RDD.html#subtract-org.apache.spark.rdd.RDD-int-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other, int numPartitions)

Return an RDD with the elements from this that are not in other.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[subtract](../../../../org/apache/spark/rdd/RDD.html#subtract-org.apache.spark.rdd.RDD-org.apache.spark.Partitioner-scala.math.Ordering-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other,[Partitioner](../../../../org/apache/spark/Partitioner.html "class in org.apache.spark") p, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Return an RDD with the elements from this that are not in other.

Object

[take](../../../../org/apache/spark/rdd/RDD.html#take-int-)(int num)

Take the first num elements of the RDD.

Object

[takeOrdered](../../../../org/apache/spark/rdd/RDD.html#takeOrdered-int-scala.math.Ordering-)(int num, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Returns the first k (smallest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.

Object

[takeSample](../../../../org/apache/spark/rdd/RDD.html#takeSample-boolean-int-long-)(boolean withReplacement, int num, long seed)

Return a fixed-size sampled subset of this RDD in an array.

String

[toDebugString](../../../../org/apache/spark/rdd/RDD.html#toDebugString--)()

A description of this RDD and its recursive dependencies for debugging.

[JavaRDD](../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[toJavaRDD](../../../../org/apache/spark/rdd/RDD.html#toJavaRDD--)()

scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[toLocalIterator](../../../../org/apache/spark/rdd/RDD.html#toLocalIterator--)()

Return an iterator that contains all of the elements in this RDD.

Object

[top](../../../../org/apache/spark/rdd/RDD.html#top-int-scala.math.Ordering-)(int num, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)

Returns the top k (largest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
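
A sketch contrasting top with its counterpart takeOrdered above:

```scala
val nums = sc.parallelize(Seq(5, 1, 9, 3, 7))
nums.takeOrdered(2)   // Array(1, 3) -- smallest, ascending
nums.top(2)           // Array(9, 7) -- largest, descending
// Both collect to the driver, so keep num small relative to driver memory.
```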

String

[toString](../../../../org/apache/spark/rdd/RDD.html#toString--)()

<U> U

[treeAggregate](../../../../org/apache/spark/rdd/RDD.html#treeAggregate-U-scala.Function2-scala.Function2-int-boolean-scala.reflect.ClassTag-)(U zeroValue, scala.Function2<U,[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> seqOp, scala.Function2<U,U,U> combOp, int depth, boolean finalAggregateOnExecutor, scala.reflect.ClassTag<U> evidence$35)

<U> U

[treeAggregate](../../../../org/apache/spark/rdd/RDD.html#treeAggregate-U-scala.Function2-scala.Function2-int-scala.reflect.ClassTag-)(U zeroValue, scala.Function2<U,[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> seqOp, scala.Function2<U,U,U> combOp, int depth, scala.reflect.ClassTag<U> evidence$34)

Aggregates the elements of this RDD in a multi-level tree pattern.
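
A sketch; same contract as aggregate, but partition results are merged in intermediate rounds rather than all at once on the driver:

```scala
val nums = sc.parallelize(1 to 1000000, 100)
val total = nums.treeAggregate(0L)(
  (acc, x) => acc + x,   // seqOp within a partition
  (a, b)   => a + b,     // combOp, applied in `depth` rounds of partial merges
  depth = 2              // the default; deepen when there are very many partitions
)
```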

[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")

[treeReduce](../../../../org/apache/spark/rdd/RDD.html#treeReduce-scala.Function2-int-)(scala.Function2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> f, int depth)

Reduces the elements of this RDD in a multi-level tree pattern.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[union](../../../../org/apache/spark/rdd/RDD.html#union-org.apache.spark.rdd.RDD-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other)

Return the union of this RDD and another one.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[unpersist](../../../../org/apache/spark/rdd/RDD.html#unpersist-boolean-)(boolean blocking)

Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>

[withResources](../../../../org/apache/spark/rdd/RDD.html#withResources-org.apache.spark.resource.ResourceProfile-)([ResourceProfile](../../../../org/apache/spark/resource/ResourceProfile.html "class in org.apache.spark.resource") rp)

Specify a ResourceProfile to use when calculating this RDD.

<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U>>

[zip](../../../../org/apache/spark/rdd/RDD.html#zip-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U> other, scala.reflect.ClassTag<U> evidence$13)

Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
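
A sketch; zip has a strict shape requirement:

```scala
val letters = sc.parallelize(Seq("a", "b", "c"), 2)
val nums    = sc.parallelize(Seq(1, 2, 3), 2)
letters.zip(nums).collect()   // Array((a,1), (b,2), (c,3))
// Both RDDs must have the same number of partitions *and* the same number of
// elements per partition, otherwise the job fails at runtime.
```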

<B,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>

[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-boolean-scala.Function2-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2, boolean preservesPartitioning, scala.Function2<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$14, scala.reflect.ClassTag<V> evidence$15)

Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.

<B,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>

[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-scala.Function2-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2, scala.Function2<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$16, scala.reflect.ClassTag<V> evidence$17)

<B,C,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>

[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-boolean-scala.Function3-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<C> rdd3, boolean preservesPartitioning, scala.Function3<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$18, scala.reflect.ClassTag<C> evidence$19, scala.reflect.ClassTag<V> evidence$20)

<B,C,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>

[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-scala.Function3-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<C> rdd3, scala.Function3<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$21, scala.reflect.ClassTag<C> evidence$22, scala.reflect.ClassTag<V> evidence$23)

<B,C,D,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>

[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-boolean-scala.Function4-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<C> rdd3,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<D> rdd4, boolean preservesPartitioning, scala.Function4<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$24, scala.reflect.ClassTag<C> evidence$25, scala.reflect.ClassTag<D> evidence$26, scala.reflect.ClassTag<V> evidence$27)

<B,C,D,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>

[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-scala.Function4-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<C> rdd3,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<D> rdd4, scala.Function4<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$28, scala.reflect.ClassTag<C> evidence$29, scala.reflect.ClassTag<D> evidence$30, scala.reflect.ClassTag<V> evidence$31)

<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>

[zipPartitionsWithEvaluator](../../../../org/apache/spark/rdd/RDD.html#zipPartitionsWithEvaluator-org.apache.spark.rdd.RDD-org.apache.spark.PartitionEvaluatorFactory-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> rdd2,[PartitionEvaluatorFactory](../../../../org/apache/spark/PartitionEvaluatorFactory.html "interface in org.apache.spark")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> evaluatorFactory, scala.reflect.ClassTag<U> evidence$11)

Zip this RDD's partitions with another RDD and return a new RDD by applying an evaluator to the zipped partitions.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),Object>>

[zipWithIndex](../../../../org/apache/spark/rdd/RDD.html#zipWithIndex--)()

Zips this RDD with its element indices.

[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),Object>>

[zipWithUniqueId](../../../../org/apache/spark/rdd/RDD.html#zipWithUniqueId--)()

Zips this RDD with generated unique Long ids.
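
A sketch contrasting zipWithUniqueId with zipWithIndex above:

```scala
val letters = sc.parallelize(Seq("a", "b", "c", "d"), 2)
letters.zipWithIndex().collect()
// Array((a,0), (b,1), (c,2), (d,3)) -- global, contiguous indices, but this
// triggers an extra Spark job to count elements in the leading partitions.
letters.zipWithUniqueId().collect()
// Unique but non-contiguous ids (element i of partition k gets i*n + k, where
// n = number of partitions); computed without triggering a job.
```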