ENH: Improve error message when specifying dtype="float32[pyarrow]" while PyArrow is not installed · Issue #57928 · pandas-dev/pandas
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
If PyArrow is not installed properly, running the following code snippet from the User Guide:
ser = pd.Series([-1.5, 0.2, None], dtype="float32[pyarrow]")
will result in:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../python3.12/site-packages/pandas/core/series.py", line 493, in __init__
    dtype = self._validate_dtype(dtype)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../python3.12/site-packages/pandas/core/generic.py", line 515, in _validate_dtype
    dtype = pandas_dtype(dtype)
            ^^^^^^^^^^^^^^^^^^^
  File ".../python3.12/site-packages/pandas/core/dtypes/common.py", line 1624, in pandas_dtype
    result = registry.find(dtype)
             ^^^^^^^^^^^^^^^^^^^^
  File ".../python3.12/site-packages/pandas/core/dtypes/base.py", line 576, in find
    return dtype_type.construct_from_string(dtype)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../python3.12/site-packages/pandas/core/dtypes/dtypes.py", line 2251, in construct_from_string
    pa_dtype = pa.type_for_alias(base_type)
               ^^
NameError: name 'pa' is not defined
which is not very informative.
Feature Description
We can improve the error message by telling the user that something is wrong with the PyArrow installation. This matters most when the user believes they have installed PyArrow but actually installed it in the wrong location or installed an outdated version.
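For anyone hitting this in the meantime, the two situations above (wrong location, outdated version) can be checked from the same interpreter that runs pandas. This is only a rough diagnostic sketch using the standard library; `describe_optional_dependency` is a name made up for this example and is not part of pandas.

```python
import importlib.metadata
import importlib.util
import sys


def describe_optional_dependency(name: str) -> str:
    """Report whether `name` is importable from the running interpreter.

    Hypothetical helper for illustration only; not a pandas API.
    """
    # find_spec looks the module up on sys.path without importing it.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name} is not importable from {sys.executable}"
    try:
        # Version of the installed distribution with the same name.
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = "unknown"
    return f"{name} {version} found at {spec.origin} (interpreter: {sys.executable})"


print(describe_optional_dependency("pyarrow"))
```

If the reported interpreter is not the one PyArrow was installed into, or the reported version is older than the minimum listed in the Installation Guide, that would explain the failure.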
Alternative Solutions
Catch the NameError and raise an ImportError from it describing what happened.
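A minimal sketch of this idea is below. It is not the actual pandas code: `PYARROW_AVAILABLE` and `construct_from_string_sketch` are stand-ins invented for this example, the flag is hard-coded to `False` to simulate the failed optional import, and the error message wording is only a suggestion.

```python
# Assumption: simulate an environment where the optional import did not happen.
PYARROW_AVAILABLE = False

if PYARROW_AVAILABLE:
    import pyarrow as pa  # skipped here, so the name `pa` is never bound


def construct_from_string_sketch(string: str):
    """Hypothetical stand-in for the dtype lookup that currently fails."""
    base_type = string.removesuffix("[pyarrow]")
    try:
        # Without the import above, looking up `pa` raises NameError.
        return pa.type_for_alias(base_type)
    except NameError as err:
        # Chain the original error so the full traceback is still available.
        raise ImportError(
            f"PyArrow is required to use dtype {string!r}, but it could not be "
            "imported. Check that a supported version of pyarrow is installed "
            "in the environment that runs pandas."
        ) from err


try:
    construct_from_string_sketch("float32[pyarrow]")
except ImportError as exc:
    print(f"{type(exc).__name__}: {exc}")
```

Chaining with `from err` keeps the original NameError visible as the cause, so nothing is lost for debugging while the top-level message points at the installation.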
Additional Context
This conforms to the description in the Installation Guide: If the optional dependency is not installed, pandas will raise an ImportError when the method requiring that dependency is called.