RDD (Spark 3.5.5 JavaDoc)
Modifier and Type
Method and Description
<U> U
[aggregate](../../../../org/apache/spark/rdd/RDD.html#aggregate-U-scala.Function2-scala.Function2-scala.reflect.ClassTag-)(U zeroValue, scala.Function2<U,[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> seqOp, scala.Function2<U,U,U> combOp, scala.reflect.ClassTag<U> evidence$33)
Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
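For example, because the accumulator type U may differ from the element type T, aggregate can compute a sum and a count in a single pass. A minimal spark-shell sketch (sc is the shell's predefined SparkContext; nums is illustrative):

```scala
val nums = sc.parallelize(1 to 100)
// Accumulator is (runningSum, runningCount): seqOp folds each element in,
// combOp merges the per-partition accumulators.
val (sum, count) = nums.aggregate((0, 0))(
  (acc, v) => (acc._1 + v, acc._2 + 1),
  (a, b)   => (a._1 + b._1, a._2 + b._2))
val mean = sum.toDouble / count  // 50.5
```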
[RDDBarrier](../../../../org/apache/spark/rdd/RDDBarrier.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[barrier](../../../../org/apache/spark/rdd/RDD.html#barrier--)()
:: Experimental :: Marks the current stage as a barrier stage, where Spark must launch all tasks together.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[cache](../../../../org/apache/spark/rdd/RDD.html#cache--)()
Persist this RDD with the default storage level (MEMORY_ONLY).
<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U>>
[cartesian](../../../../org/apache/spark/rdd/RDD.html#cartesian-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U> other, scala.reflect.ClassTag<U> evidence$5)
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
void
[checkpoint](../../../../org/apache/spark/rdd/RDD.html#checkpoint--)()
Mark this RDD for checkpointing.
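Checkpointing requires a checkpoint directory to be set beforehand, and nothing is written until an action materializes the RDD. A minimal spark-shell sketch (the directory path is illustrative):

```scala
sc.setCheckpointDir("/tmp/spark-checkpoints")  // illustrative path
val rdd = sc.parallelize(1 to 10).map(_ * 2)
rdd.checkpoint()  // marks the RDD; no data is saved yet
rdd.count()       // first action computes the RDD and writes the checkpoint
```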
void
[cleanShuffleDependencies](../../../../org/apache/spark/rdd/RDD.html#cleanShuffleDependencies-boolean-)(boolean blocking)
Removes an RDD's shuffles and its non-persisted ancestors.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[coalesce](../../../../org/apache/spark/rdd/RDD.html#coalesce-int-boolean-scala.Option-scala.math.Ordering-)(int numPartitions, boolean shuffle, scala.Option<[PartitionCoalescer](../../../../org/apache/spark/rdd/PartitionCoalescer.html "interface in org.apache.spark.rdd")> partitionCoalescer, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Return a new RDD that is reduced into numPartitions partitions.
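With shuffle = false (the default in the simpler overloads), coalesce merges existing partitions without a shuffle, so it can only decrease the partition count; pass shuffle = true (or use repartition) to increase it. A minimal spark-shell sketch:

```scala
val wide = sc.parallelize(1 to 1000, numSlices = 100)
val narrow = wide.coalesce(10)   // merges partitions locally, no shuffle
narrow.getNumPartitions          // 10
// Growing the partition count requires a shuffle:
val rebalanced = wide.coalesce(200, shuffle = true)
```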
Object
[collect](../../../../org/apache/spark/rdd/RDD.html#collect--)()
Return an array that contains all of the elements in this RDD.
<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>
[collect](../../../../org/apache/spark/rdd/RDD.html#collect-scala.PartialFunction-scala.reflect.ClassTag-)(scala.PartialFunction<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> f, scala.reflect.ClassTag<U> evidence$32)
Return an RDD that contains all matching values by applying f.
abstract scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[compute](../../../../org/apache/spark/rdd/RDD.html#compute-org.apache.spark.Partition-org.apache.spark.TaskContext-)([Partition](../../../../org/apache/spark/Partition.html "interface in org.apache.spark") split,[TaskContext](../../../../org/apache/spark/TaskContext.html "class in org.apache.spark") context)
:: DeveloperApi :: Implemented by subclasses to compute a given partition.
[SparkContext](../../../../org/apache/spark/SparkContext.html "class in org.apache.spark")
[context](../../../../org/apache/spark/rdd/RDD.html#context--)()
The SparkContext that this RDD was created on.
long
[count](../../../../org/apache/spark/rdd/RDD.html#count--)()
Return the number of elements in the RDD.
[PartialResult](../../../../org/apache/spark/partial/PartialResult.html "class in org.apache.spark.partial")<[BoundedDouble](../../../../org/apache/spark/partial/BoundedDouble.html "class in org.apache.spark.partial")>
[countApprox](../../../../org/apache/spark/rdd/RDD.html#countApprox-long-double-)(long timeout, double confidence)
Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
long
[countApproxDistinct](../../../../org/apache/spark/rdd/RDD.html#countApproxDistinct-double-)(double relativeSD)
Return approximate number of distinct elements in the RDD.
long
[countApproxDistinct](../../../../org/apache/spark/rdd/RDD.html#countApproxDistinct-int-int-)(int p, int sp)
Return approximate number of distinct elements in the RDD.
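The relativeSD overload trades accuracy for memory (smaller values use more space) and avoids the shuffle that an exact distinct count requires. A minimal spark-shell sketch; the data is illustrative:

```scala
val ids = sc.parallelize(1 to 100000).map(_ % 5000)
ids.countApproxDistinct(relativeSD = 0.05)  // ~5000, typically within ~5%
ids.distinct().count()                      // exact, but needs a shuffle
```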
scala.collection.Map<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),Object>
[countByValue](../../../../org/apache/spark/rdd/RDD.html#countByValue-scala.math.Ordering-)(scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
[PartialResult](../../../../org/apache/spark/partial/PartialResult.html "class in org.apache.spark.partial")<scala.collection.Map<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[BoundedDouble](../../../../org/apache/spark/partial/BoundedDouble.html "class in org.apache.spark.partial")>>
[countByValueApprox](../../../../org/apache/spark/rdd/RDD.html#countByValueApprox-long-double-scala.math.Ordering-)(long timeout, double confidence, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Approximate version of countByValue().
scala.collection.Seq<[Dependency](../../../../org/apache/spark/Dependency.html "class in org.apache.spark")<?>>
[dependencies](../../../../org/apache/spark/rdd/RDD.html#dependencies--)()
Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[distinct](../../../../org/apache/spark/rdd/RDD.html#distinct--)()
Return a new RDD containing the distinct elements in this RDD.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[distinct](../../../../org/apache/spark/rdd/RDD.html#distinct-int-scala.math.Ordering-)(int numPartitions, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Return a new RDD containing the distinct elements in this RDD.
static [DoubleRDDFunctions](../../../../org/apache/spark/rdd/DoubleRDDFunctions.html "class in org.apache.spark.rdd")
[doubleRDDToDoubleRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#doubleRDDToDoubleRDDFunctions-org.apache.spark.rdd.RDD-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<Object> rdd)
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[filter](../../../../org/apache/spark/rdd/RDD.html#filter-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),Object> f)
Return a new RDD containing only the elements that satisfy a predicate.
[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")
[first](../../../../org/apache/spark/rdd/RDD.html#first--)()
Return the first element in this RDD.
<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>
[flatMap](../../../../org/apache/spark/rdd/RDD.html#flatMap-scala.Function1-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),scala.collection.TraversableOnce<U>> f, scala.reflect.ClassTag<U> evidence$4)
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
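A minimal spark-shell sketch of the classic use: each input element expands to zero or more output elements, which are then flattened into one RDD:

```scala
val lines = sc.parallelize(Seq("to be or", "not to be"))
val words = lines.flatMap(_.split(" "))  // one line -> many words
words.collect()  // Array(to, be, or, not, to, be)
```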
[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")
[fold](../../../../org/apache/spark/rdd/RDD.html#fold-T-scala.Function2-)([T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD") zeroValue, scala.Function2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> op)
Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
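Unlike aggregate, fold keeps the element type, and the zero value must be neutral for op, because it is applied once per partition and again in the final merge. A minimal spark-shell sketch:

```scala
val nums = sc.parallelize(Array(3, 1, 4, 1, 5))
nums.fold(0)(_ + _)     // 14; 0 is neutral for +
// nums.fold(1)(_ + _)  // wrong: adds 1 once per partition plus once more
```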
void
[foreach](../../../../org/apache/spark/rdd/RDD.html#foreach-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),scala.runtime.BoxedUnit> f)
Applies a function f to all elements of this RDD.
void
[foreachPartition](../../../../org/apache/spark/rdd/RDD.html#foreachPartition-scala.Function1-)(scala.Function1<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.runtime.BoxedUnit> f)
Applies a function f to each partition of this RDD.
scala.Option<String>
[getCheckpointFile](../../../../org/apache/spark/rdd/RDD.html#getCheckpointFile--)()
Gets the name of the directory to which this RDD was checkpointed.
int
[getNumPartitions](../../../../org/apache/spark/rdd/RDD.html#getNumPartitions--)()
Returns the number of partitions of this RDD.
[ResourceProfile](../../../../org/apache/spark/resource/ResourceProfile.html "class in org.apache.spark.resource")
[getResourceProfile](../../../../org/apache/spark/rdd/RDD.html#getResourceProfile--)()
Get the ResourceProfile specified with this RDD or null if it wasn't specified.
[StorageLevel](../../../../org/apache/spark/storage/StorageLevel.html "class in org.apache.spark.storage")
[getStorageLevel](../../../../org/apache/spark/rdd/RDD.html#getStorageLevel--)()
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<Object>
[glom](../../../../org/apache/spark/rdd/RDD.html#glom--)()
Return an RDD created by coalescing all elements within each partition into an array.
<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,scala.collection.Iterable<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>>>
[groupBy](../../../../org/apache/spark/rdd/RDD.html#groupBy-scala.Function1-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f, scala.reflect.ClassTag<K> kt)
Return an RDD of grouped items.
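A minimal spark-shell sketch; note that groupBy shuffles every value for a key to one machine, so for simple aggregations a reduce-style operation is usually cheaper:

```scala
val nums = sc.parallelize(1 to 10)
val byParity = nums.groupBy(n => n % 2)  // RDD[(Int, Iterable[Int])]
byParity.collect()
// e.g. Array((0,CompactBuffer(2, 4, 6, 8, 10)), (1,CompactBuffer(1, 3, 5, 7, 9)))
// (group order in the collected array is not guaranteed)
```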
<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,scala.collection.Iterable<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>>>
[groupBy](../../../../org/apache/spark/rdd/RDD.html#groupBy-scala.Function1-int-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f, int numPartitions, scala.reflect.ClassTag<K> kt)
Return an RDD of grouped elements.
<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,scala.collection.Iterable<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>>>
[groupBy](../../../../org/apache/spark/rdd/RDD.html#groupBy-scala.Function1-org.apache.spark.Partitioner-scala.reflect.ClassTag-scala.math.Ordering-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f,[Partitioner](../../../../org/apache/spark/Partitioner.html "class in org.apache.spark") p, scala.reflect.ClassTag<K> kt, scala.math.Ordering<K> ord)
Return an RDD of grouped items.
int
[id](../../../../org/apache/spark/rdd/RDD.html#id--)()
A unique ID for this RDD (within its SparkContext).
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[intersection](../../../../org/apache/spark/rdd/RDD.html#intersection-org.apache.spark.rdd.RDD-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other)
Return the intersection of this RDD and another one.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[intersection](../../../../org/apache/spark/rdd/RDD.html#intersection-org.apache.spark.rdd.RDD-int-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other, int numPartitions)
Return the intersection of this RDD and another one.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[intersection](../../../../org/apache/spark/rdd/RDD.html#intersection-org.apache.spark.rdd.RDD-org.apache.spark.Partitioner-scala.math.Ordering-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other,[Partitioner](../../../../org/apache/spark/Partitioner.html "class in org.apache.spark") partitioner, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Return the intersection of this RDD and another one.
boolean
[isCheckpointed](../../../../org/apache/spark/rdd/RDD.html#isCheckpointed--)()
Return whether this RDD is checkpointed and materialized, either reliably or locally.
boolean
[isEmpty](../../../../org/apache/spark/rdd/RDD.html#isEmpty--)()
Return true if and only if this RDD contains no elements at all.
scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[iterator](../../../../org/apache/spark/rdd/RDD.html#iterator-org.apache.spark.Partition-org.apache.spark.TaskContext-)([Partition](../../../../org/apache/spark/Partition.html "interface in org.apache.spark") split,[TaskContext](../../../../org/apache/spark/TaskContext.html "class in org.apache.spark") context)
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>>
[keyBy](../../../../org/apache/spark/rdd/RDD.html#keyBy-scala.Function1-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f)
Creates tuples of the elements in this RDD by applying f.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[localCheckpoint](../../../../org/apache/spark/rdd/RDD.html#localCheckpoint--)()
Mark this RDD for local checkpointing using Spark's existing caching layer.
<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>
[map](../../../../org/apache/spark/rdd/RDD.html#map-scala.Function1-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> f, scala.reflect.ClassTag<U> evidence$3)
Return a new RDD by applying a function to all elements of this RDD.
<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>
[mapPartitions](../../../../org/apache/spark/rdd/RDD.html#mapPartitions-scala.Function1-boolean-scala.reflect.ClassTag-)(scala.Function1<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$6)
Return a new RDD by applying a function to each partition of this RDD.
<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>
[mapPartitionsWithEvaluator](../../../../org/apache/spark/rdd/RDD.html#mapPartitionsWithEvaluator-org.apache.spark.PartitionEvaluatorFactory-scala.reflect.ClassTag-)([PartitionEvaluatorFactory](../../../../org/apache/spark/PartitionEvaluatorFactory.html "interface in org.apache.spark")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> evaluatorFactory, scala.reflect.ClassTag<U> evidence$10)
Return a new RDD by applying an evaluator to each partition of this RDD.
<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>
[mapPartitionsWithIndex](../../../../org/apache/spark/rdd/RDD.html#mapPartitionsWithIndex-scala.Function2-boolean-scala.reflect.ClassTag-)(scala.Function2<Object,scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$9)
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
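A minimal spark-shell sketch: tagging each element with its partition index is a handy way to inspect how data is distributed, e.g. when debugging skew:

```scala
val data = sc.parallelize(1 to 8, numSlices = 4)
val tagged = data.mapPartitionsWithIndex { (idx, it) =>
  it.map(v => s"partition=$idx value=$v")  // idx is the partition index
}
tagged.collect().foreach(println)
```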
[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")
[max](../../../../org/apache/spark/rdd/RDD.html#max-scala.math.Ordering-)(scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Returns the max of this RDD as defined by the implicit Ordering[T].
[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")
[min](../../../../org/apache/spark/rdd/RDD.html#min-scala.math.Ordering-)(scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Returns the min of this RDD as defined by the implicit Ordering[T].
String
[name](../../../../org/apache/spark/rdd/RDD.html#name--)()
A friendly name for this RDD.
static <T> [DoubleRDDFunctions](../../../../org/apache/spark/rdd/DoubleRDDFunctions.html "class in org.apache.spark.rdd")
[numericRDDToDoubleRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#numericRDDToDoubleRDDFunctions-org.apache.spark.rdd.RDD-scala.math.Numeric-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<T> rdd, scala.math.Numeric<T> num)
scala.Option<[Partitioner](../../../../org/apache/spark/Partitioner.html "class in org.apache.spark")>
[partitioner](../../../../org/apache/spark/rdd/RDD.html#partitioner--)()
Optionally overridden by subclasses to specify how they are partitioned.
[Partition](../../../../org/apache/spark/Partition.html "interface in org.apache.spark")[]
[partitions](../../../../org/apache/spark/rdd/RDD.html#partitions--)()
Get the array of partitions of this RDD, taking into account whether the RDD is checkpointed or not.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[persist](../../../../org/apache/spark/rdd/RDD.html#persist--)()
Persist this RDD with the default storage level (MEMORY_ONLY).
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[persist](../../../../org/apache/spark/rdd/RDD.html#persist-org.apache.spark.storage.StorageLevel-)([StorageLevel](../../../../org/apache/spark/storage/StorageLevel.html "class in org.apache.spark.storage") newLevel)
Set this RDD's storage level to persist its values across operations after the first time it is computed.
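A minimal spark-shell sketch (the input path is illustrative). Persisting pays off when an RDD is reused by more than one action:

```scala
import org.apache.spark.storage.StorageLevel

val errors = sc.textFile("/tmp/events.log")   // illustrative path
  .filter(_.contains("ERROR"))
errors.persist(StorageLevel.MEMORY_AND_DISK)  // spill to disk if memory is tight
errors.count()   // first action computes and caches the RDD
errors.take(5)   // subsequent actions read from the cache
```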
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<String>
[pipe](../../../../org/apache/spark/rdd/RDD.html#pipe-scala.collection.Seq-scala.collection.Map-scala.Function1-scala.Function2-boolean-int-java.lang.String-)(scala.collection.Seq<String> command, scala.collection.Map<String,String> env, scala.Function1<scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printPipeContext, scala.Function2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printRDDElement, boolean separateWorkingDir, int bufferSize, String encoding)
Return an RDD created by piping elements to a forked external process.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<String>
[pipe](../../../../org/apache/spark/rdd/RDD.html#pipe-java.lang.String-)(String command)
Return an RDD created by piping elements to a forked external process.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<String>
[pipe](../../../../org/apache/spark/rdd/RDD.html#pipe-java.lang.String-scala.collection.Map-)(String command, scala.collection.Map<String,String> env)
Return an RDD created by piping elements to a forked external process.
scala.collection.Seq<String>
[preferredLocations](../../../../org/apache/spark/rdd/RDD.html#preferredLocations-org.apache.spark.Partition-)([Partition](../../../../org/apache/spark/Partition.html "interface in org.apache.spark") split)
Get the preferred locations of a partition, taking into account whether the RDD is checkpointed.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>[]
[randomSplit](../../../../org/apache/spark/rdd/RDD.html#randomSplit-double:A-long-)(double[] weights, long seed)
Randomly splits this RDD with the provided weights.
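A minimal spark-shell sketch of the common train/test split. The weights are normalized if they do not sum to 1, and the resulting split sizes are approximate, not exact:

```scala
val all = sc.parallelize(1 to 10000)
val Array(train, test) = all.randomSplit(Array(0.8, 0.2), seed = 42L)
(train.count(), test.count())  // roughly (8000, 2000)
```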
static <T> [AsyncRDDActions](../../../../org/apache/spark/rdd/AsyncRDDActions.html "class in org.apache.spark.rdd")<T>
[rddToAsyncRDDActions](../../../../org/apache/spark/rdd/RDD.html#rddToAsyncRDDActions-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<T> rdd, scala.reflect.ClassTag<T> evidence$38)
static <K,V> [OrderedRDDFunctions](../../../../org/apache/spark/rdd/OrderedRDDFunctions.html "class in org.apache.spark.rdd")<K,V,scala.Tuple2<K,V>>
[rddToOrderedRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#rddToOrderedRDDFunctions-org.apache.spark.rdd.RDD-scala.math.Ordering-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,V>> rdd, scala.math.Ordering<K> evidence$39, scala.reflect.ClassTag<K> evidence$40, scala.reflect.ClassTag<V> evidence$41)
static <K,V> [PairRDDFunctions](../../../../org/apache/spark/rdd/PairRDDFunctions.html "class in org.apache.spark.rdd")<K,V>
[rddToPairRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#rddToPairRDDFunctions-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.math.Ordering-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,V>> rdd, scala.reflect.ClassTag<K> kt, scala.reflect.ClassTag<V> vt, scala.math.Ordering<K> ord)
static <K,V> [SequenceFileRDDFunctions](../../../../org/apache/spark/rdd/SequenceFileRDDFunctions.html "class in org.apache.spark.rdd")<K,V>
[rddToSequenceFileRDDFunctions](../../../../org/apache/spark/rdd/RDD.html#rddToSequenceFileRDDFunctions-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-scala.reflect.ClassTag---)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<K,V>> rdd, scala.reflect.ClassTag<K> kt, scala.reflect.ClassTag<V> vt, <any> keyWritableFactory, <any> valueWritableFactory)
[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")
[reduce](../../../../org/apache/spark/rdd/RDD.html#reduce-scala.Function2-)(scala.Function2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> f)
Reduces the elements of this RDD using the specified commutative and associative binary operator.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[repartition](../../../../org/apache/spark/rdd/RDD.html#repartition-int-scala.math.Ordering-)(int numPartitions, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Return a new RDD that has exactly numPartitions partitions.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[sample](../../../../org/apache/spark/rdd/RDD.html#sample-boolean-double-long-)(boolean withReplacement, double fraction, long seed)
Return a sampled subset of this RDD.
void
[saveAsObjectFile](../../../../org/apache/spark/rdd/RDD.html#saveAsObjectFile-java.lang.String-)(String path)
Save this RDD as a SequenceFile of serialized objects.
void
[saveAsTextFile](../../../../org/apache/spark/rdd/RDD.html#saveAsTextFile-java.lang.String-)(String path)
Save this RDD as a text file, using string representations of elements.
void
[saveAsTextFile](../../../../org/apache/spark/rdd/RDD.html#saveAsTextFile-java.lang.String-java.lang.Class-)(String path, Class<? extends org.apache.hadoop.io.compress.CompressionCodec> codec)
Save this RDD as a compressed text file, using string representations of elements.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[setName](../../../../org/apache/spark/rdd/RDD.html#setName-java.lang.String-)(String _name)
Assign a name to this RDD.
<K> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[sortBy](../../../../org/apache/spark/rdd/RDD.html#sortBy-scala.Function1-boolean-int-scala.math.Ordering-scala.reflect.ClassTag-)(scala.Function1<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),K> f, boolean ascending, int numPartitions, scala.math.Ordering<K> ord, scala.reflect.ClassTag<K> ctag)
Return this RDD sorted by the given key function.
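A minimal spark-shell sketch; the key function can derive any ordered key from the elements:

```scala
val words = sc.parallelize(Seq("spark", "rdd", "partition", "shuffle"))
words.sortBy(_.length).collect()                     // shortest first
words.sortBy(_.length, ascending = false).collect() // longest first
```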
[SparkContext](../../../../org/apache/spark/SparkContext.html "class in org.apache.spark")
[sparkContext](../../../../org/apache/spark/rdd/RDD.html#sparkContext--)()
The SparkContext that created this RDD.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[subtract](../../../../org/apache/spark/rdd/RDD.html#subtract-org.apache.spark.rdd.RDD-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other)
Return an RDD with the elements from this that are not in other.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[subtract](../../../../org/apache/spark/rdd/RDD.html#subtract-org.apache.spark.rdd.RDD-int-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other, int numPartitions)
Return an RDD with the elements from this that are not in other.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[subtract](../../../../org/apache/spark/rdd/RDD.html#subtract-org.apache.spark.rdd.RDD-org.apache.spark.Partitioner-scala.math.Ordering-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other,[Partitioner](../../../../org/apache/spark/Partitioner.html "class in org.apache.spark") p, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Return an RDD with the elements from this that are not in other.
Object
[take](../../../../org/apache/spark/rdd/RDD.html#take-int-)(int num)
Take the first num elements of the RDD.
Object
[takeOrdered](../../../../org/apache/spark/rdd/RDD.html#takeOrdered-int-scala.math.Ordering-)(int num, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Returns the first k (smallest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
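Both takeOrdered and top avoid a full sort of the RDD; they are mirror images under a reversed Ordering. A minimal spark-shell sketch:

```scala
val nums = sc.parallelize(Seq(10, 4, 2, 12, 3))
nums.takeOrdered(2)                        // Array(2, 3): the smallest
nums.takeOrdered(2)(Ordering[Int].reverse) // Array(12, 10): same as top(2)
```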
Object
[takeSample](../../../../org/apache/spark/rdd/RDD.html#takeSample-boolean-int-long-)(boolean withReplacement, int num, long seed)
Return a fixed-size sampled subset of this RDD in an array.
String
[toDebugString](../../../../org/apache/spark/rdd/RDD.html#toDebugString--)()
A description of this RDD and its recursive dependencies for debugging.
[JavaRDD](../../../../org/apache/spark/api/java/JavaRDD.html "class in org.apache.spark.api.java")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[toJavaRDD](../../../../org/apache/spark/rdd/RDD.html#toJavaRDD--)()
scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[toLocalIterator](../../../../org/apache/spark/rdd/RDD.html#toLocalIterator--)()
Return an iterator that contains all of the elements in this RDD.
Object
[top](../../../../org/apache/spark/rdd/RDD.html#top-int-scala.math.Ordering-)(int num, scala.math.Ordering<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> ord)
Returns the top k (largest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
String
[toString](../../../../org/apache/spark/rdd/RDD.html#toString--)()
<U> U
[treeAggregate](../../../../org/apache/spark/rdd/RDD.html#treeAggregate-U-scala.Function2-scala.Function2-int-boolean-scala.reflect.ClassTag-)(U zeroValue, scala.Function2<U,[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> seqOp, scala.Function2<U,U,U> combOp, int depth, boolean finalAggregateOnExecutor, scala.reflect.ClassTag<U> evidence$35)
<U> U
[treeAggregate](../../../../org/apache/spark/rdd/RDD.html#treeAggregate-U-scala.Function2-scala.Function2-int-scala.reflect.ClassTag-)(U zeroValue, scala.Function2<U,[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> seqOp, scala.Function2<U,U,U> combOp, int depth, scala.reflect.ClassTag<U> evidence$34)
Aggregates the elements of this RDD in a multi-level tree pattern.
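With many partitions, the tree pattern merges partial aggregates on executors in depth levels instead of sending every per-partition accumulator straight to the driver. A minimal spark-shell sketch:

```scala
val nums = sc.parallelize(1 to 1000000, numSlices = 1000)
// A two-level tree combines the 1000 partial sums on executors first,
// easing the load on the driver.
nums.treeAggregate(0L)((acc, v) => acc + v, _ + _, depth = 2)
```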
[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")
[treeReduce](../../../../org/apache/spark/rdd/RDD.html#treeReduce-scala.Function2-int-)(scala.Function2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> f, int depth)
Reduces the elements of this RDD in a multi-level tree pattern.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[union](../../../../org/apache/spark/rdd/RDD.html#union-org.apache.spark.rdd.RDD-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> other)
Return the union of this RDD and another one.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[unpersist](../../../../org/apache/spark/rdd/RDD.html#unpersist-boolean-)(boolean blocking)
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>
[withResources](../../../../org/apache/spark/rdd/RDD.html#withResources-org.apache.spark.resource.ResourceProfile-)([ResourceProfile](../../../../org/apache/spark/resource/ResourceProfile.html "class in org.apache.spark.resource") rp)
Specify a ResourceProfile to use when calculating this RDD.
<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U>>
[zip](../../../../org/apache/spark/rdd/RDD.html#zip-org.apache.spark.rdd.RDD-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U> other, scala.reflect.ClassTag<U> evidence$13)
Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
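zip assumes the two RDDs have the same number of partitions and the same number of elements in each partition (e.g. one was made through a map on the other). A minimal spark-shell sketch:

```scala
val a = sc.parallelize(Seq("a", "b", "c"), numSlices = 2)
val b = sc.parallelize(Seq(1, 2, 3), numSlices = 2)
// Same partition count and per-partition sizes, so zip is valid here.
a.zip(b).collect()  // Array((a,1), (b,2), (c,3))
```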
<B,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>
[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-boolean-scala.Function2-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2, boolean preservesPartitioning, scala.Function2<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$14, scala.reflect.ClassTag<V> evidence$15)
Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
<B,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>
[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-scala.Function2-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2, scala.Function2<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$16, scala.reflect.ClassTag<V> evidence$17)
<B,C,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>
[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-boolean-scala.Function3-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<C> rdd3, boolean preservesPartitioning, scala.Function3<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$18, scala.reflect.ClassTag<C> evidence$19, scala.reflect.ClassTag<V> evidence$20)
<B,C,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>
[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-scala.Function3-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<C> rdd3, scala.Function3<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$21, scala.reflect.ClassTag<C> evidence$22, scala.reflect.ClassTag<V> evidence$23)
<B,C,D,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>
[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-boolean-scala.Function4-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<C> rdd3,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<D> rdd4, boolean preservesPartitioning, scala.Function4<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$24, scala.reflect.ClassTag<C> evidence$25, scala.reflect.ClassTag<D> evidence$26, scala.reflect.ClassTag<V> evidence$27)
<B,C,D,V> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<V>
[zipPartitions](../../../../org/apache/spark/rdd/RDD.html#zipPartitions-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-org.apache.spark.rdd.RDD-scala.Function4-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<B> rdd2,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<C> rdd3,[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<D> rdd4, scala.Function4<scala.collection.Iterator<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$28, scala.reflect.ClassTag<C> evidence$29, scala.reflect.ClassTag<D> evidence$30, scala.reflect.ClassTag<V> evidence$31)
<U> [RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<U>
[zipPartitionsWithEvaluator](../../../../org/apache/spark/rdd/RDD.html#zipPartitionsWithEvaluator-org.apache.spark.rdd.RDD-org.apache.spark.PartitionEvaluatorFactory-scala.reflect.ClassTag-)([RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD")> rdd2,[PartitionEvaluatorFactory](../../../../org/apache/spark/PartitionEvaluatorFactory.html "interface in org.apache.spark")<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),U> evaluatorFactory, scala.reflect.ClassTag<U> evidence$11)
Zip this RDD's partitions with another RDD and return a new RDD by applying an evaluator to the zipped partitions.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),Object>>
[zipWithIndex](../../../../org/apache/spark/rdd/RDD.html#zipWithIndex--)()
Zips this RDD with its element indices.
[RDD](../../../../org/apache/spark/rdd/RDD.html "class in org.apache.spark.rdd")<scala.Tuple2<[T](../../../../org/apache/spark/rdd/RDD.html "type parameter in RDD"),Object>>
[zipWithUniqueId](../../../../org/apache/spark/rdd/RDD.html#zipWithUniqueId--)()
Zips this RDD with generated unique Long ids.
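The trade-off between the two zips: zipWithIndex produces contiguous indices but triggers an extra Spark job when the RDD has more than one partition, while zipWithUniqueId needs no job but its ids are unique rather than contiguous. A minimal spark-shell sketch:

```scala
val letters = sc.parallelize(Seq("a", "b", "c", "d"), numSlices = 2)
letters.zipWithIndex().collect()
// Array((a,0), (b,1), (c,2), (d,3)): contiguous, costs an extra job
letters.zipWithUniqueId().collect()
// items in partition k get ids k, k+n, k+2n, ... (n = partition count)
```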