pyspark.sql.DataFrame.toArrow — PySpark 4.1.0 documentation
Returns the contents of this DataFrame as a PyArrow pyarrow.Table.
This is only available if PyArrow is installed and available.
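Because the method depends on PyArrow, a caller may want to check for the package before invoking it. The following is a minimal sketch, not part of the PySpark API: the helper names `pyarrow_available` and `to_arrow_or_none` are hypothetical, and the choice to return None when PyArrow is missing is an assumption made for illustration.

```python
import importlib.util


def pyarrow_available() -> bool:
    """Return True if the pyarrow package can be found on the import path."""
    return importlib.util.find_spec("pyarrow") is not None


def to_arrow_or_none(df):
    """Call df.toArrow() only when PyArrow is importable.

    Hypothetical helper: returns None instead of attempting the
    conversion when the pyarrow package cannot be found.
    """
    if not pyarrow_available():
        return None
    return df.toArrow()
```

A caller could then fall back to another collection method, such as `df.collect()`, when the helper returns None.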
New in version 4.0.0.
Notes
This method should only be used if the resulting PyArrow pyarrow.Table is expected to be small, as all the data is loaded into the driver’s memory.
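One way to keep the result small is to cap the number of rows before collecting. The sketch below assumes a hypothetical helper, `to_arrow_capped`, with an arbitrary default cap; it relies only on `DataFrame.limit` and `DataFrame.toArrow`, and the right cap depends on row width and the driver's available memory.

```python
def to_arrow_capped(df, max_rows=10_000):
    """Collect at most max_rows rows of df to the driver as a pyarrow.Table.

    Hypothetical helper: the default cap is an illustrative value,
    not a recommendation from the PySpark documentation.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # limit() is applied before toArrow(), so the driver receives
    # at most max_rows rows.
    return df.limit(max_rows).toArrow()
```

For example, `to_arrow_capped(df, max_rows=1000)` would bring no more than 1000 rows into driver memory.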
This API is a developer API.
Examples
>>> df = spark.createDataFrame([(2, "Alice"), (5, "Bob")], schema=["age", "name"])
>>> df.coalesce(1).toArrow()
pyarrow.Table
age: int64
name: string
----
age: [[2,5]]
name: [["Alice","Bob"]]
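Once the data is on the driver, the returned object is an ordinary pyarrow.Table and can be inspected with PyArrow's own API. The sketch below uses a hypothetical helper, `describe_table`, which reads the `num_rows` and `column_names` attributes and calls `to_pylist()` on the table; it is an illustration of consuming the result, not part of PySpark.

```python
def describe_table(table):
    """Summarize a pyarrow.Table as a plain dictionary.

    Hypothetical helper: gathers the row count, the column names,
    and the rows converted to a list of dicts.
    """
    return {
        "num_rows": table.num_rows,
        "columns": list(table.column_names),
        "rows": table.to_pylist(),
    }
```

Calling `describe_table(df.toArrow())` on a small DataFrame gives a quick view of its shape and contents in native Python types.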