str.cat should return categorical data for categorical caller · Issue #20845 · pandas-dev/pandas

The str.cat method (on the .str accessor) works for both Series and Index, and returns an object of the caller's type:

import pandas as pd

s = pd.Series(['a', 'b', 'a'])
t = pd.Index(['a', 'b', 'a'])
## all of the following return the same Series
s.str.cat(s)
s.str.cat(t)
s.str.cat(s.values)
s.str.cat(list(s))
# 0    aa
# 1    bb
# 2    aa
# dtype: object

## all of the following return the same Index
t.str.cat(s)
t.str.cat(t)
t.str.cat(s.values)
t.str.cat(list(s))
# Index(['aa', 'bb', 'aa'], dtype='object')

However, when the caller is categorical, the result of str.cat loses the category dtype and comes back as object, which is inconsistent, IMO:

sc = s.astype('category')
tc = pd.Index(['a', 'b', 'a'], dtype='category') # conversion does not work, see #20843
sc.str.cat(s)
# 0    aa
# 1    bb
# 2    aa
# dtype: object
## as opposed to:
sc.str.cat(s).astype('category')
# 0    aa
# 1    bb
# 2    aa
# dtype: category
# Categories (2, object): [aa, bb]
tc.str.cat(s) # crashes, see #20842
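
Until something like this is handled inside str.cat itself, the proposed behavior can be approximated from user code. The following is only a minimal sketch for a Series caller: `cat_keep_category` is a hypothetical helper name, not part of pandas, and it simply re-casts the result when the caller was categorical. The new categories are inferred from the concatenated values, not derived from the caller's original categories.

```python
import pandas as pd


def cat_keep_category(caller, others):
    """Hypothetical helper (not a pandas API): concatenate with str.cat
    and re-cast to category if the caller was categorical."""
    result = caller.str.cat(others)
    if isinstance(caller.dtype, pd.CategoricalDtype):
        # Categories are inferred from the concatenated strings.
        result = result.astype('category')
    return result


s = pd.Series(['a', 'b', 'a'])
sc = s.astype('category')

out = cat_keep_category(sc, s)    # categorical caller: category result
plain = cat_keep_category(s, s)   # non-categorical caller: left unchanged
```

Whether a real fix should infer the categories this way, or construct them from the caller's categories, is a design question this sketch does not settle.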

xref #20842 #20843