When .loc returns IndexError rather than KeyError · Issue #12527 · pandas-dev/pandas

I have a DataFrame with a MultiIndex that raises a KeyError for one integer and an IndexError for a different integer; neither integer is in the first level of the index. This only occurs when accessing a scalar value; a slice always gives a KeyError. The behavior does not occur on a cut-down version of the (>200M when pickled) data frame, or I would attach a working example. I can send the file if needed, though.

>>> isinstance(n, int)
True
>>> df.loc[(n, 0), 'dest']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 1196, in __getitem__
    return self._getitem_tuple(key)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 709, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 817, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 889, in _getitem_nested_tuple
    obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 1343, in _getitem_axis
    return self._get_label(key, axis=axis)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 86, in _get_label
    return self.obj._xs(label, axis=axis)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/generic.py", line 1483, in xs
    drop_level=drop_level)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/index.py", line 5432, in get_loc_level
    return (self._engine.get_loc(_values_from_object(key)),
  File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:3979)
  File "pandas/index.pyx", line 146, in pandas.index.IndexEngine.get_loc (pandas/index.c:3693)
  File "pandas/src/util.pxd", line 41, in util.get_value_at (pandas/index.c:13199)
IndexError: index out of bounds

Expected Output

>>> df.loc[(m, 0), 'dest']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 1196, in __getitem__
    return self._getitem_tuple(key)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 709, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 817, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 889, in _getitem_nested_tuple
    obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 1343, in _getitem_axis
    return self._get_label(key, axis=axis)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 86, in _get_label
    return self.obj._xs(label, axis=axis)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/generic.py", line 1483, in xs
    drop_level=drop_level)
  File "/usr/local/lib/python3.5/site-packages/pandas/core/index.py", line 5432, in get_loc_level
    return (self._engine.get_loc(_values_from_object(key)),
  File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:3979)
  File "pandas/index.pyx", line 147, in pandas.index.IndexEngine.get_loc (pandas/index.c:3719)
KeyError: (300067502, 0)
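
For comparison, here is a minimal sketch of the same kind of scalar lookup on a small stand-in frame. The data, the level names `id` and `seq`, and the helper `lookup_error_type` are hypothetical and chosen only for illustration; only the `dest` column name comes from the session above. It does not reproduce the IndexError, it only shows the KeyError path I would expect for any key missing from the first level.

```python
import pandas as pd

# Hypothetical stand-in data: the real frame is far larger and is not shown here.
index = pd.MultiIndex.from_tuples(
    [(10, 0), (10, 1), (20, 0), (30, 0)], names=["id", "seq"]
)
df = pd.DataFrame({"dest": ["a", "b", "c", "d"]}, index=index)


def lookup_error_type(frame, row_key, column):
    """Return the exception class raised by a scalar .loc lookup, or None."""
    try:
        frame.loc[row_key, column]
    except Exception as exc:  # broad on purpose: we only want to report the type
        return type(exc)
    return None


present = lookup_error_type(df, (10, 0), "dest")   # key exists in the index
missing = lookup_error_type(df, (99, 0), "dest")   # 99 is not in the first level
print(present, missing)
```

Running the same helper over a range of candidate integers on the full frame would be one way to list which keys take the IndexError path.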

The expected behavior (a KeyError) occurs for nearly all integers I try that are not in the first level of the index. How could particular integers raise an IndexError instead?
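
Until the cause is understood, one possible stopgap is to treat both exceptions as "key not found". The sketch below relies on the fact that KeyError and IndexError are both subclasses of Python's built-in LookupError. The helper name `lookup_or_default` and its `default` parameter are my own, not part of pandas, and it assumes only that the object passed in exposes `.loc`.

```python
def lookup_or_default(frame, row_key, column, default=None):
    """Scalar .loc lookup that returns `default` when the key is not found.

    Hypothetical helper, not a pandas API. LookupError is the common base
    class of KeyError and IndexError, so either failure is handled the same
    way. Note that this can also hide an unrelated IndexError, so it is a
    workaround rather than a fix.
    """
    try:
        return frame.loc[row_key, column]
    except LookupError:
        return default
```

With the frame from this report, the call would look like `lookup_or_default(df, (n, 0), 'dest')`, which should give `None` for both `n` and `m`.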

output of pd.show_versions()

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.0
nose: 1.3.7
pip: 8.0.2
setuptools: 19.4
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.16.1
statsmodels: None
IPython: 4.0.0
sphinx: None
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.4.6
matplotlib: 1.5.0
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: None