BUG: Series.xs() inconsistent with DataFrame.xs() with MultiIndex · Issue #5684 · pandas-dev/pandas
Edited to clarify the bug
A Series.xs slice fails with string index labels and a MultiIndex:
In [1]: idx = pd.MultiIndex.from_tuples([('a', 'one'), ('a', 'two'), ('b', 'one'), ('b', 'two')])
In [2]: df = pd.Series(np.random.randn(4), index=idx)
In [4]: df.index.set_names(['L1', 'L2'], inplace=True)
In [5]: df
Out[5]:
L1  L2
a   one   -0.136418
    two   -0.346941
b   one   -1.468534
    two    1.217693
dtype: float64
In [6]: df.xs('one', level='L2')
KeyError                                  Traceback (most recent call last)
in ()
----> 1 df.xs('one', level='L2')

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in _xs(self, key, axis, level, copy)
    437
    438     def _xs(self, key, axis=0, level=None, copy=True):
--> 439         return self.__getitem__(key)
    440
    441     xs = _xs

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in __getitem__(self, key)
    482     def __getitem__(self, key):
    483         try:
--> 484             return self.index.get_value(self, key)
    485         except InvalidIndexError:
    486             pass

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/index.pyc in get_value(self, series, key)
   2294                     raise InvalidIndexError(key)
   2295             else:
-> 2296                 raise e1
   2297         except Exception:  # pragma: no cover
   2298             raise e1

KeyError: 'one'
The same slice works on a DataFrame.
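For comparison, here is a minimal sketch of the DataFrame case. The column name `x` and the fixed values are made up for illustration; the index is the one from the session above.

```python
import numpy as np
import pandas as pd

# Same MultiIndex as in the report, with the level names set up front.
idx = pd.MultiIndex.from_tuples(
    [('a', 'one'), ('a', 'two'), ('b', 'one'), ('b', 'two')],
    names=['L1', 'L2'],
)

# Hypothetical one-column frame with deterministic values instead of randn.
frame = pd.DataFrame({'x': np.arange(4.0)}, index=idx)

# Cross-section on the second level: keeps the rows where L2 == 'one'
# and drops the L2 level from the result's index.
result = frame.xs('one', level='L2')
print(result)
```

The result is indexed by L1 only, which is the shape I expected from the Series call as well.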
Previous post below:
In [12]: idx = pd.MultiIndex.from_tuples([('a', 0), ('a', 1), ('b', 0), ('b', 1)])
In [13]: df = pd.Series(np.random.randn(4), index=idx)
In [14]: df
Out[14]:
a  0    0.876121
   1    0.638050
b  0    0.965934
   1    1.061716
dtype: float64
In [15]: df.xs(0, level=1) # returns scalar
Out[15]: 0.87612104445620753
In [16]: df.index.names = ['L1', 'L2']
In [27]: df.xs(0, level='L2') # returns scalar
Out[27]: -0.98585685847339011
In [28]: df.xs(0, level='L1') # No key error
Out[28]: -0.98585685847339011
Works for DataFrames:
In [30]: df.xs(0, level='L2')
Out[30]:
           0
L1
a  -0.985857
b   0.648114

[2 rows x 1 columns]
Series.xs also seems to fail on string index labels?
In [50]: idx = pd.MultiIndex.from_tuples([('a', 'one'), ('a', 'two'), ('b', 'one'), ('b', 'two')])
In [53]: df = pd.Series(np.random.randn(4), index=idx)
In [56]: df.xs('one', level='L2')
KeyError                                  Traceback (most recent call last)
in ()
----> 1 df.xs('one', level='L2')

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in _xs(self, key, axis, level, copy)
    437
    438     def _xs(self, key, axis=0, level=None, copy=True):
--> 439         return self.__getitem__(key)
    440
    441     xs = _xs

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in __getitem__(self, key)
    482     def __getitem__(self, key):
    483         try:
--> 484             return self.index.get_value(self, key)
    485         except InvalidIndexError:
    486             pass

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/index.pyc in get_value(self, series, key)
   2294                     raise InvalidIndexError(key)
   2295             else:
-> 2296                 raise e1
   2297         except Exception:  # pragma: no cover
   2298             raise e1

KeyError: 'one'
So I guess this is about 3 errors on Series.xs (possibly related?):
- Returning a scalar when it should return a Series, when the label is an integer
- Not raising a KeyError for a label that is missing from the requested level, when the label is an integer
- Failing on slices for level>1 when the label is a string.
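To make the first two points concrete, here is a rough sketch of the behavior I would expect, written as checks. It assumes a pandas build where Series.xs honors `level` the way DataFrame.xs does; it is not what 0.13.0rc1 does in the sessions above. The fixed values stand in for the random data.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [('a', 0), ('a', 1), ('b', 0), ('b', 1)],
    names=['L1', 'L2'],
)
s = pd.Series(np.arange(4.0), index=idx)

# Expectation 1: selecting label 0 on level L2 gives a Series keyed by L1,
# not a single scalar.
sub = s.xs(0, level='L2')
is_series = isinstance(sub, pd.Series)

# Expectation 2: label 0 does not exist in level L1 ('a', 'b'),
# so the lookup should raise KeyError instead of returning a value.
try:
    s.xs(0, level='L1')
    raised = False
except KeyError:
    raised = True

print(is_series, raised)
```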
EDIT: Oh, and I know that .loc / .ix will work for these. I was just surprised by the results.
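For reference, a sketch of the .loc route on the string-labelled Series. Using `pd.IndexSlice` to build the key is my choice here, not the only spelling. I left .ix out because it was removed in later pandas releases.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [('a', 'one'), ('a', 'two'), ('b', 'one'), ('b', 'two')],
    names=['L1', 'L2'],
)
s = pd.Series(np.arange(4.0), index=idx)

# Take every L1 label, but only the rows where L2 == 'one'.
# pd.IndexSlice[:, 'one'] is shorthand for the tuple (slice(None), 'one').
picked = s.loc[pd.IndexSlice[:, 'one']]
print(picked)
```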