BUG: repr of DataFrame fails when using ArrowDtype with arrow extension type · Issue #54062 · pandas-dev/pandas (original) (raw)
Using pandas latest main branch ('2.1.0.dev0+1159.gc126eeb392' at the time of writing) and pyarrow 12.0.1, when you construct a column using the ArrowDtype
extension dtype where the pyarrow type is a pyarrow extension type, the repr of DataFrame fails:
In [1]: ext_arr = pa.array(pd.period_range("2012", periods=3))
In [2]: df = pa.table({'col': ext_arr}).to_pandas(types_mapper=pd.ArrowDtype)
In [3]: df
Out[3]:
In [4]: df["col"]
Out[4]:
0 15340
1 15341
2 15342
Name: col, dtype: extension<pandas.period<ArrowPeriodType>>[pyarrow]
So it appears to be just empty, while the repr of the Series works correctly. This empty output is the result of it actually failing (does python catch that when displaying?), when calling repr
explicitly:
In [6]: repr(df)
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
Cell In[6], line 1
----> 1 repr(df)
File ~/scipy/pandas/pandas/core/frame.py:1100, in DataFrame.__repr__(self)
1097 return buf.getvalue()
1099 repr_params = fmt.get_dataframe_repr_params()
-> 1100 return self.to_string(**repr_params)
...
File ~/scipy/pandas/pandas/io/formats/format.py:947, in DataFrameFormatter._get_formatted_column_labels(self, frame)
945 fmt_columns = columns.format()
946 dtypes = self.frame.dtypes
--> 947 need_leadsp = dict(zip(fmt_columns, map(is_numeric_dtype, dtypes)))
948 str_columns = [
949 [" " + x if not self._get_formatter(i) and need_leadsp[x] else x]
950 for i, x in enumerate(fmt_columns)
951 ]
952 # self.str_columns = str_columns
File ~/scipy/pandas/pandas/core/dtypes/common.py:1104, in is_numeric_dtype(arr_or_dtype)
1066 def is_numeric_dtype(arr_or_dtype) -> bool:
1067 """
1068 Check whether the provided array or dtype is of a numeric dtype.
1069
(...)
1102 False
1103 """
-> 1104 return _is_dtype_type(
1105 arr_or_dtype, _classes_and_not_datetimelike(np.number, np.bool_)
1106 ) or _is_dtype(
1107 arr_or_dtype, lambda typ: isinstance(typ, ExtensionDtype) and typ._is_numeric
1108 )
File ~/scipy/pandas/pandas/core/dtypes/common.py:1459, in _is_dtype_type(arr_or_dtype, condition)
1456 return condition(type(None))
1458 try:
-> 1459 tipo = pandas_dtype(arr_or_dtype).type
1460 except (TypeError, ValueError):
1461 if is_scalar(arr_or_dtype):
File ~/scipy/pandas/pandas/core/dtypes/dtypes.py:2106, in ArrowDtype.type(self)
2104 return type(pa_type)
2105 else:
-> 2106 raise NotImplementedError(pa_type)
NotImplementedError: extension<pandas.period<ArrowPeriodType>>