API: Index.str follow-ups (extract/get_dummies) · Issue #9980 · pandas-dev/pandas

Follow-up to #9667. I noticed two .str methods that can return a DataFrame.

1. Index.str.extract

As shown in the docstring, it returns a DataFrame when the expression has two or more groups.

pd.Series(['a1', 'b2', 'c3']).str.extract(r'[ab](\d)')
#0      1
#1      2
#2    NaN
# dtype: object

pd.Series(['a1', 'b2', 'c3']).str.extract(r'([ab])(\d)')
#      0    1
#0    a    1
#1    b    2
#2  NaN  NaN

Currently, Index.str.extract raises an unhelpful AttributeError in both cases. I think the first case (one group) should return an Index, and the second case (two or more groups) should raise an understandable error.

pd.Index(['a1', 'b2', 'c3']).str.extract(r'[ab](\d)')
# AttributeError: 'Index' object has no attribute 'index'

pd.Index(['a1', 'b2', 'c3']).str.extract(r'([ab])(\d)')
# AttributeError: 'Index' object has no attribute 'empty'
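
To make the proposal for extract concrete, here is a rough sketch of the behaviour I have in mind. It is not a patch against the accessor itself: extract_from_index is a hypothetical helper, and using TypeError with this message is only my assumption about what an "understandable error" could look like. It counts the capture groups with re.compile(pat).groups and delegates the single-group case to the Series implementation.

```python
import re

import pandas as pd


def extract_from_index(idx, pat):
    """Hypothetical helper sketching the proposed Index.str.extract behaviour."""
    n_groups = re.compile(pat).groups
    if n_groups == 0:
        raise ValueError("pattern contains no capture groups")
    if n_groups > 1:
        # Assumption: TypeError is one reasonable choice; the issue only
        # asks for an understandable error, not a specific exception type.
        raise TypeError(
            "extract with more than one capture group would return a "
            "DataFrame, which cannot be represented as an Index"
        )
    # One group: reuse the Series implementation, then convert back.
    extracted = idx.to_series().str.extract(pat, expand=False)
    return pd.Index(extracted.to_numpy(), name=idx.name)
```

With this helper, the one-group pattern from above would give back an Index, and the two-group pattern would fail with the explicit message instead of the AttributeError.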

2. Index.str.get_dummies

Because it returns a DataFrame, calling it on an Index should raise an understandable error.

pd.Index(['a1', 'b2', 'c3']).str.get_dummies()
# AttributeError: 'Index' object has no attribute 'fillna'
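
The same idea for get_dummies, again only as a sketch: get_dummies_checked is a hypothetical wrapper, and the exception type and message are my assumptions rather than anything decided. The point is just that the check happens up front, before the Series-only code path is reached.

```python
import pandas as pd


def get_dummies_checked(data, sep="|"):
    """Hypothetical wrapper: reject Index input with a clear message."""
    if isinstance(data, pd.Index):
        # Assumption: TypeError and this wording are illustrative only.
        raise TypeError(
            "get_dummies returns a DataFrame and is not supported on Index; "
            "convert first with idx.to_series()"
        )
    # Series input goes through the existing implementation unchanged.
    return data.str.get_dummies(sep=sep)
```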

CC: @mortada