pyspark.RDD.keys - PySpark 3.5.5 documentation

RDD.keys() → pyspark.rdd.RDD[K]

Return an RDD with the keys of each tuple.

New in version 0.7.0.

Returns

RDD

an RDD containing only the keys

Examples

>>> rdd = sc.parallelize([(1, 2), (3, 4)]).keys()
>>> rdd.collect()
[1, 3]
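
The sketch below models what keys() does to each element, in plain Python so it runs without a Spark session. It assumes keys() behaves like mapping every pair to its first element; keys_of is a hypothetical helper written for illustration and is not part of PySpark.

```python
# Plain-Python model of RDD.keys(); no Spark session is needed to run it.
from typing import Iterable, List, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def keys_of(pairs: Iterable[Tuple[K, V]]) -> List[K]:
    """Hypothetical helper: keep the first element of each pair, in order."""
    return [pair[0] for pair in pairs]


pairs = [(1, 2), (3, 4), (1, 5)]
result = keys_of(pairs)
print(result)  # [1, 3, 1]: the repeated key 1 is kept, not deduplicated

# Assumed PySpark equivalent (requires a SparkContext named sc):
#   sc.parallelize(pairs).keys().collect()
#   sc.parallelize(pairs).map(lambda kv: kv[0]).collect()
```

Because repeated keys are kept, chaining distinct() after keys() is one way to obtain each key once.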