ENH: Adding support for calamine as Excel reader engine · Issue #50395 · pandas-dev/pandas
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
Reading Excel files in Pandas is considerably slower than in some alternative data frame tools; for example, the readxl package in R can read Excel files much faster. The Rust calamine library can read Excel files much faster than other engines supported by Pandas, and there is an existing Python binding to it, python-calamine. I would like to request that Pandas add official support for calamine, so that users can read an Excel file like:
pd.read_excel("test.xlsx", engine="calamine")
Feature Description
The python-calamine package already implements code that enables the calamine engine in Pandas; see the examples using pandas_monkeypatch() at the bottom of their GitHub README. The code to enable this is here.
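For reference, the monkeypatch approach looks roughly like the sketch below. It assumes python-calamine exposes pandas_monkeypatch from python_calamine.pandas, as its README shows; the helper names calamine_available and read_excel_with_calamine are hypothetical and only wrap that call so the import is attempted when the function is used.

```python
import importlib.util


def calamine_available() -> bool:
    """Return True if the python-calamine binding can be imported."""
    return importlib.util.find_spec("python_calamine") is not None


def read_excel_with_calamine(path, **kwargs):
    """Read an Excel file through the monkeypatched calamine engine.

    Hypothetical helper: it assumes python-calamine provides
    pandas_monkeypatch in python_calamine.pandas, as described in its README.
    """
    # Imported lazily so this module still loads without the optional packages.
    import pandas as pd
    from python_calamine.pandas import pandas_monkeypatch

    # Registers "calamine" as an engine name that pd.read_excel will accept.
    pandas_monkeypatch()
    return pd.read_excel(path, engine="calamine", **kwargs)
```

With both packages installed, read_excel_with_calamine("test.xlsx") should then behave like the pd.read_excel call requested above.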
Although python-calamine already implements the necessary features to use the library with Pandas, I am unclear on how similar the behavior is between calamine and the other engines that Pandas supports, such as openpyxl. I am hoping that by bringing calamine in as an officially supported engine, Pandas unit tests will confirm consistent behavior across calamine and the other engines.
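To illustrate the kind of consistency check I have in mind, here is a minimal sketch that reads the same file with several engines and asserts that the resulting frames match. The helpers installed_engines and assert_engines_agree and the ENGINE_MODULES mapping are hypothetical names, not part of Pandas or python-calamine, and the sketch assumes "calamine" is already accepted as an engine name (for example after the monkeypatch).

```python
import importlib.util

# Engine name -> module that must be importable for that engine to work.
# Assumption: "calamine" is backed by the python_calamine package.
ENGINE_MODULES = {"openpyxl": "openpyxl", "calamine": "python_calamine"}


def installed_engines(candidates=None):
    """Return the engine names whose backing module can be imported."""
    if candidates is None:
        candidates = ENGINE_MODULES
    return [
        name
        for name, module in candidates.items()
        if importlib.util.find_spec(module) is not None
    ]


def assert_engines_agree(path, engines, **read_kwargs):
    """Read `path` with every engine and assert all frames are equal.

    Returns the frames keyed by engine name so callers can inspect them.
    """
    if len(engines) < 2:
        raise ValueError("need at least two engines to compare")

    # Imported lazily so the helpers above work without pandas installed.
    import pandas as pd
    import pandas.testing as tm

    frames = {
        engine: pd.read_excel(path, engine=engine, **read_kwargs)
        for engine in engines
    }
    reference, *others = engines
    for other in others:
        # Raises AssertionError with a diff if values, dtypes or labels differ.
        tm.assert_frame_equal(frames[reference], frames[other])
    return frames
```

A call such as assert_engines_agree("test.xlsx", installed_engines()) would then flag any difference in values, dtypes, or labels between engines. The actual Pandas test suite may structure this differently; this only shows the idea.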
Alternative Solutions
None
Additional Context
No response