pandas.read_iceberg — pandas 3.0.0rc0+33.g1fd184de2a documentation
pandas.read_iceberg(table_identifier, catalog_name=None, *, catalog_properties=None, row_filter=None, selected_fields=None, case_sensitive=True, snapshot_id=None, limit=None, scan_properties=None)
Read an Apache Iceberg table into a pandas DataFrame.
Added in version 3.0.0.
Warning
read_iceberg is experimental and may change without warning.
Parameters:
table_identifier : str
Table identifier.
catalog_name : str, optional
The name of the catalog.
catalog_properties : dict of {str: str}, optional
Properties used in addition to the catalog configuration.
row_filter : str, optional
A string expression describing which rows to return, for example "trip_distance >= 10.0".
selected_fields : tuple of str, optional
A tuple of strings representing the column names to return in the output DataFrame.
case_sensitive : bool, default True
If True, column matching is case sensitive.
snapshot_id : int, optional
Snapshot ID to time travel to. By default, the table is scanned as of the current snapshot ID.
limit : int, optional
The number of rows to return in the scan result. By default, all matching rows are fetched.
scan_properties : dict of {str: obj}, optional
Additional table properties to use for this scan, as a dictionary with string keys.
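As a hedged illustration of how snapshot_id and limit combine, the sketch below previews a few rows of a table as of a given snapshot. The helper name preview_snapshot and its reader argument are hypothetical and not part of pandas; reader exists only so the call can be exercised without a live catalog. By default it resolves to pd.read_iceberg, which assumes pandas 3.0 or later with its Iceberg dependency installed.

```python
from typing import Any, Callable, Optional


def preview_snapshot(
    table_identifier: str,
    snapshot_id: int,
    n_rows: int = 5,
    catalog_name: Optional[str] = None,
    reader: Optional[Callable[..., Any]] = None,
) -> Any:
    """Return at most ``n_rows`` rows of the table as of ``snapshot_id``.

    ``preview_snapshot`` and ``reader`` are hypothetical names used for
    illustration only.
    """
    if reader is None:
        # Imported lazily so the helper can be defined without pandas installed.
        import pandas as pd

        reader = pd.read_iceberg
    return reader(
        table_identifier,
        catalog_name,
        snapshot_id=snapshot_id,  # time travel to this snapshot
        limit=n_rows,  # cap the number of rows returned by the scan
    )
```

Leaving snapshot_id out of a direct read_iceberg call scans the current snapshot instead, and leaving limit out fetches all matching rows.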
Returns:
DataFrame
DataFrame based on the Iceberg table.
Examples
>>> df = pd.read_iceberg(
...     table_identifier="my_table",
...     catalog_name="my_catalog",
...     catalog_properties={"s3.secret-access-key": "my-secret"},
...     row_filter="trip_distance >= 10.0",
...     selected_fields=("VendorID", "tpep_pickup_datetime"),
... )
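The example above hard-codes the secret for brevity. A minimal sketch of one alternative, assuming the secret is supplied through an environment variable, is shown below. The helper name s3_catalog_properties and the variable name ICEBERG_S3_SECRET are assumptions made for this sketch; only the "s3.secret-access-key" key comes from the example.

```python
import os
from typing import Mapping, Optional


def s3_catalog_properties(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Build a ``catalog_properties`` dict with the secret read from the environment.

    The environment variable name ICEBERG_S3_SECRET is hypothetical.
    """
    env = os.environ if environ is None else environ
    secret = env.get("ICEBERG_S3_SECRET")
    if secret is None:
        # Fail early rather than passing an empty secret to the catalog.
        raise KeyError("ICEBERG_S3_SECRET is not set")
    return {"s3.secret-access-key": secret}
```

The returned dictionary could then be passed as catalog_properties=s3_catalog_properties() in the call to pd.read_iceberg.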