Clean multiindex keys by toobaz · Pull Request #15615 · pandas-dev/pandas

This is related to 05d70f4. Before:

In [2]: s = pd.Series(range(9), index=pd.MultiIndex.from_product([['A', 'B', 'C'], ['foo', 'bar', 'baz']],names=['one', 'two']))

In [3]: s.loc._getitem_iterable(['A', 'E'])
Out[3]:
one  two
A    foo    0
     bar    1
     baz    2
dtype: int64

In [4]: pd.util.print_versions.get_sys_info()[0]
Out[4]: ('commit', '998c801f76256990b98d3f0d2ad885ae27c955a1')

After:

In [2]: s = pd.Series(range(9), index=pd.MultiIndex.from_product([['A', 'B', 'C'], ['foo', 'bar', 'baz']],names=['one', 'two']))

In [3]: s.loc._getitem_iterable(['A', 'E'])

KeyError                                  Traceback (most recent call last)
in ()
----> 1 s.loc._getitem_iterable(['A', 'E'])

/home/nobackup/repo/pandas/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
   1090             # if it cannot handle
   1091             indexer, keyarr = labels._convert_listlike_indexer(
-> 1092                 key, kind=self.name)
   1093             if indexer is not None:
   1094                 return self.obj.take(indexer, axis=axis)

/home/nobackup/repo/pandas/pandas/indexes/multi.py in _convert_listlike_indexer(self, keyarr, kind)
   1598             mask = check == -1
   1599             if mask.any():
-> 1600                 raise KeyError('%s not in index' % keyarr[mask])
   1601
   1602             return indexer, keyarr

KeyError: "['E'] not in index"

In [4]: pd.util.print_versions.get_sys_info()[0]
Out[4]: ('commit', '05d70f4e617a274813bdb02db69143b5554aa106')
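
For reference, the check that raises in the "After" traceback boils down to roughly the following. This is only a minimal pure-Python sketch of the idea (map each key to a position, use -1 for missing keys, raise if any are missing), not the actual pandas code; the function name `convert_listlike` and its arguments are hypothetical and only model lookup on a single level.

```python
def convert_listlike(level_labels, keys):
    """Hypothetical stand-in for the lookup seen in the traceback.

    Maps each key to its position in level_labels, using -1 for keys
    that are not present, and raises KeyError if any key is missing.
    """
    positions = {label: i for i, label in enumerate(level_labels)}
    # -1 marks a key that was not found, mirroring `check == -1`
    indexer = [positions.get(key, -1) for key in keys]
    missing = [key for key, pos in zip(keys, indexer) if pos == -1]
    if missing:
        raise KeyError('%s not in index' % missing)
    return indexer


# All keys present: positions are returned.
print(convert_listlike(['A', 'B', 'C'], ['A', 'C']))

# One key missing: the whole lookup fails instead of silently dropping 'E'.
try:
    convert_listlike(['A', 'B', 'C'], ['A', 'E'])
except KeyError as err:
    print('KeyError:', err)
```

The "Before" session corresponds to silently dropping the missing key instead of raising.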

My patch is still as good as before, but your new code is not consistent with our "desired" behavior for indexers.

Now, I assumed that proceeding step by step was the best way to move forward, but that assumption was clearly wrong if, while I patiently wait for my trivial patch to get merged, you do find the time to completely rewrite the related indexing methods (deleting a line I had explicitly mentioned as important when submitting my PR, and all this in a commit which appears as "DOC: ..." in the logs, just to make it easier to spot).

So, considering that the changes you made ironically go in a direction I like, we should probably have the "famous" discussion first and then come back to this. I'm opening an issue for this in a minute.