API/BUG: allow .str-accessor for 1-level MultiIndex? Return what? · Issue #23679 · pandas-dev/pandas

if isinstance(data, MultiIndex) and data.nlevels > 1:
    raise ...

This means the constructor accepts a MultiIndex with a single level, yet essentially all methods then fail or produce garbage:

>>> idx = pd.Index(['aaa', 'bb', 'c'])
>>> mi = pd.MultiIndex.from_arrays([idx])
>>> mi.str.len()
Int64Index([1, 1, 1], dtype='int64')  # compare idx.str.len() == Int64Index([3, 2, 1], dtype='int64')
>>> mi.str.cat()
[...]
NotImplementedError: initializing a Series from a MultiIndex is not supported
>>> mi.str.startswith('a')
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.upper()
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.islower()
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.split()
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.find('a')
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.ljust(10)
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.repeat(3)
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.slice(1, 2)
Index([(), (), ()], dtype='object')  # compare idx.str.slice(1, 2) == Index(['a', 'b', ''], dtype='object')
>>> mi.str.zfill(10)
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.wrap(2)
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.normalize('NFC')
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.index('')
[...]
ValueError: tuple.index(x): x not in tuple
>>> mi.str.get(1)
Float64Index([nan, nan, nan], dtype='float64')
>>> mi.str.contains('a')
Float64Index([nan, nan, nan], dtype='float64')
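
For comparison, here is a minimal sketch of how the same methods behave once the single level is pulled out as a flat Index. This is only a workaround using `MultiIndex.get_level_values`, not a proposal made in this issue, and the variable name `level0` is an illustrative assumption.

```python
import pandas as pd

idx = pd.Index(['aaa', 'bb', 'c'])
mi = pd.MultiIndex.from_arrays([idx])

# Workaround sketch: extract the only level as a flat Index of strings,
# so the .str accessor sees plain strings instead of 1-tuples.
level0 = mi.get_level_values(0)

lengths = level0.str.len()          # string lengths, not tuple lengths
upper = level0.str.upper()          # upper-cased labels
starts = level0.str.startswith('a')  # boolean result per label

print(list(lengths))
print(list(upper))
print(list(starts))
```

The garbage results above appear to come from the methods seeing each entry as a 1-tuple such as `('aaa',)`, which is why `len` returns 1 and most string operations return NaN.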

My original plan in #23167 was just to disable MultiIndex.str regardless of the number of levels, but @toobaz brought up the point (in a side discussion in #23670) that:

This would, of course, work without problem. The main question that arises from this issue:

PS. As another link to #23670, one could maybe consider enabling .str for every MultiIndex by operating on MultiIndex.to_flat_index() in those cases. This might be interesting, for example, for easily joining the MultiIndex levels with .str.join.
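
The idea in the PS can be tried by hand today. The sketch below assumes a pandas version that provides `MultiIndex.to_flat_index()`; the example data and the `'_'` separator are made up for illustration. It shows the explicit flattening step, not a `.str` accessor on the MultiIndex itself.

```python
import pandas as pd

# Hypothetical two-level MultiIndex, used only for illustration.
mi = pd.MultiIndex.from_arrays([['a', 'b', 'c'], ['x', 'y', 'z']])

# Flatten to an object-dtype Index whose elements are tuples of labels.
flat = mi.to_flat_index()

# .str.join concatenates the strings inside each tuple with the separator.
joined = flat.str.join('_')

print(list(joined))
```

Enabling `.str` directly on a MultiIndex would amount to performing the `to_flat_index()` step implicitly. Whether that is desirable is the open question here.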