API: label-based slicing with not-included labels · Issue #8613 · pandas-dev/pandas

I couldn't find an existing issue about this or an explanation in the docs, but today I stumbled on the following behaviour, which surprised me a bit.

Consider the following DataFrame:

In [18]: df = pd.DataFrame(np.random.randn(5,2), index=pd.date_range('2012-01-01', periods=5))

In [19]: df
Out[19]:
                   0         1
2012-01-01  2.511337 -0.776326
2012-01-02  0.133589  0.441911
2012-01-03  0.348167  1.285188
2012-01-04  1.075843  1.282131
2012-01-05  0.683006  0.558459

Slicing with a label that is not included in the index works with .ix, but not with .loc:

In [20]: df.ix['2012-01-03':'2012-01-31']
Out[20]:
                   0         1
2012-01-03  0.348167  1.285188
2012-01-04  1.075843  1.282131
2012-01-05  0.683006  0.558459

In [21]: df.loc['2012-01-03':'2012-01-31']
...
KeyError: 'stop bound [2012-01-31] is not in the [index]'
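For anyone hitting the same error, one way to get the `.ix`-style result without depending on how `.loc` treats a missing bound is to build a boolean mask from comparisons on the index. This is only a minimal sketch of a workaround, not a statement about what `.loc` should do; the names `start`, `stop`, `mask` and `subset` are illustrative and not from the original report.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the report: 5 daily rows starting 2012-01-01.
df = pd.DataFrame(
    np.random.randn(5, 2),
    index=pd.date_range("2012-01-01", periods=5),
)

start = pd.Timestamp("2012-01-03")
stop = pd.Timestamp("2012-01-31")  # deliberately not a label in the index

# Compare against the index instead of looking the labels up,
# so a bound that is absent from the index cannot raise KeyError.
mask = (df.index >= start) & (df.index <= stop)
subset = df[mask]

print(subset)
```

Because the mask is a plain comparison, it selects the rows from 2012-01-03 through 2012-01-05 regardless of whether either bound exists as a label.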

Context: I was updating some older code and wanted to replace .ix with .loc (which is what we recommend for purely label-based indexing, to prevent confusion).

Some observations:

- If this is intended, I can't find it stated anywhere in the docs, so the docs are at least lacking on this point.
- The inconsistency between [], .ix[] and .loc[] is a bit surprising here.
- It is also inconsistent with .iloc, whose behaviour was changed in 0.14 to allow out-of-bounds slicing (http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0140-api).
- Specifically for datetime-like indexing, it is also inconsistent with partial string indexing: df.loc['2012-01-03':'2012-01'] works and does what you would expect, while df.loc['2012-01-03':'2012-01-31'] fails.
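To make the comparison with positional slicing concrete, here is a small sketch of the out-of-bounds behaviour of `.iloc` mentioned above, plus a way to turn label bounds into positions with `Index.searchsorted`. It assumes a sorted index; the variable names (`lo`, `hi`, `tail`, `by_label`) are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(10).reshape(5, 2),
    index=pd.date_range("2012-01-01", periods=5),
)

# Positional slicing tolerates a stop beyond the end of the frame
# (the behaviour introduced in 0.14): the slice is simply truncated.
tail = df.iloc[2:100]

# Translate label bounds into positions. On a sorted index, searchsorted
# returns an insertion point even when the label itself is absent.
lo = df.index.searchsorted(pd.Timestamp("2012-01-03"), side="left")
hi = df.index.searchsorted(pd.Timestamp("2012-01-31"), side="right")
by_label = df.iloc[lo:hi]

print(tail.equals(by_label))
```

Both selections cover the last three rows, which is the result one might intuitively expect from a label slice whose stop bound lies past the end of a sorted index.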