BUG: Load ORC-format data failed when pandas version>1.2.0.dev0 · Issue #40918 · pandas-dev/pandas


Code Sample, a copy-pastable example

import pandas as pd
orc_data = pd.read_orc(orc_file_path)

Problem description

pandas uses the PyArrow package to load ORC and Parquet data.

For the ORC format, pandas calls pyarrow.orc.ORCFile to read the data (orc.py). However, PyArrow does not import its orc submodule in its __init__.py file, so a plain import of pyarrow does not make pyarrow.orc available, and pandas raises AttributeError: module 'pyarrow' has no attribute 'orc'
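
The mechanism can be reproduced without PyArrow. The sketch below builds a throwaway package whose __init__.py does not import its submodule; the names demo_pkg and its orc submodule are hypothetical stand-ins for pyarrow and pyarrow.orc, not real libraries. It only illustrates the Python import behaviour that I believe is behind the error.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. "demo_pkg" and "demo_pkg.orc" are
# hypothetical stand-ins for pyarrow and pyarrow.orc.
tmp_dir = tempfile.mkdtemp()
pkg_dir = Path(tmp_dir) / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")  # deliberately does not import .orc
(pkg_dir / "orc.py").write_text("class ORCFile:\n    pass\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import demo_pkg

# Only the parent package was imported, so the submodule is not an attribute yet.
before = hasattr(demo_pkg, "orc")

import demo_pkg.orc  # importing the submodule binds it on the parent package

after = hasattr(demo_pkg, "orc")
print(before, after)
```

Accessing demo_pkg.orc before the explicit submodule import would raise the same kind of AttributeError as the one reported here.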


This bug occurs in pandas versions later than v1.2.0.dev0 (after commit 6d1541e). Before that commit, pandas/io/orc.py ran import pyarrow.orc explicitly before using pyarrow to load ORC data (see pandas/io/orc.py in v1.1.5).
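
Until this is fixed in pandas, a possible workaround is to import the submodule yourself before calling read_orc, which mirrors what v1.1.5 did internally. This is a minimal sketch, assuming pandas and pyarrow are installed; the helper name read_orc_with_workaround is hypothetical and not part of pandas.

```python
def read_orc_with_workaround(path, columns=None):
    """Read an ORC file, importing pyarrow.orc first (hypothetical helper)."""
    # Imports live inside the function so that defining the helper does not
    # itself require pandas or pyarrow to be installed.
    import pandas as pd
    import pyarrow.orc  # noqa: F401  (binds the orc submodule on the pyarrow package)

    return pd.read_orc(path, columns=columns)
```

It would be called as orc_data = read_orc_with_workaround(orc_file_path). Placing import pyarrow.orc at the top of the calling script should have the same effect.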


Testing environment: