API: label-based slicing with not-included labels · Issue #8613 · pandas-dev/pandas
I didn't directly find an issue about it, or an explanation in the docs, but I stumbled today on the following, which did surprise me a bit:
Consider the following dataframe:
In [18]: df = pd.DataFrame(np.random.randn(5,2), index=pd.date_range('2012-01-01', periods=5))
In [19]: df
Out[19]:
                   0         1
2012-01-01  2.511337 -0.776326
2012-01-02  0.133589  0.441911
2012-01-03  0.348167  1.285188
2012-01-04  1.075843  1.282131
2012-01-05  0.683006  0.558459
Slicing with a label that is not included in the index works with .ix, but not with .loc:
In [20]: df.ix['2012-01-03':'2012-01-31']
Out[20]:
                   0         1
2012-01-03  0.348167  1.285188
2012-01-04  1.075843  1.282131
2012-01-05  0.683006  0.558459
In [21]: df.loc['2012-01-03':'2012-01-31']
...
KeyError: 'stop bound [2012-01-31] is not in the [index]'
Context: I was updating some older code, and I wanted to replace .ix with .loc (as this is what we recommend for purely label-based indexing, to prevent confusion).
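For reference, a minimal sketch of one way to get the same rows without relying on the stop label being present: build a boolean mask on the index and pass that to .loc. This is only an illustration, not a proposal for how .loc slicing should behave. The deterministic values (np.arange instead of np.random.randn) and the names start, stop, mask and subset are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Same shape and index as the frame above, but with fixed values
# (assumption: the actual numbers do not matter for the slicing behaviour).
df = pd.DataFrame(
    np.arange(10, dtype=float).reshape(5, 2),
    index=pd.date_range("2012-01-01", periods=5),
)

# Bounds of the desired range; the stop bound is NOT a label in the index.
start = pd.Timestamp("2012-01-03")
stop = pd.Timestamp("2012-01-31")

# Comparing the index against the bounds never requires the bounds to exist
# as labels, so this selects every row that falls inside the range.
mask = (df.index >= start) & (df.index <= stop)
subset = df.loc[mask]

print(subset)
```

The mask includes both endpoints, which mirrors the inclusive behaviour of label-based slices.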
Some things:
- If this is intended, I can't find it stated anywhere in the docs, so the docs are at least lacking on this point.
- The inconsistency between [], .ix[] and .loc[] is a bit surprising here.
- It is also inconsistent with iloc: that behaviour was changed in 0.14 to allow out-of-bounds slicing (http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0140-api).
- Specifically for datetime-like indexing, it is also inconsistent with the feature of partial string indexing: df.loc['2012-01-03':'2012-01'] will work and do the expected thing, while df.loc['2012-01-03':'2012-01-31'] fails.
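The points about iloc and partial string indexing can be put side by side in a small sketch. It uses fixed values instead of random ones, and the names by_position, by_partial, by_label and outcome are made up for the example. The full-date .loc slice is wrapped in try/except, so the script reports whatever the installed pandas version does rather than assuming the KeyError shown above.

```python
import numpy as np
import pandas as pd

# Same shape and index as the frame in the report, with fixed values.
df = pd.DataFrame(
    np.arange(10, dtype=float).reshape(5, 2),
    index=pd.date_range("2012-01-01", periods=5),
)

# Positional slicing: a stop position past the end is clipped (since 0.14).
by_position = df.iloc[2:100]

# Partial string indexing: '2012-01' stands for the whole month of January,
# so the stop bound does not need to be a label in the index.
by_partial = df.loc["2012-01-03":"2012-01"]

# Full date string that is not in the index: the case reported in this issue.
# Record the outcome instead of assuming it, since it may differ by version.
try:
    by_label = df.loc["2012-01-03":"2012-01-31"]
    outcome = "returned %d rows" % len(by_label)
except KeyError as exc:
    outcome = "KeyError: %s" % exc

print(len(by_position), len(by_partial), outcome)
```

If all three forms are meant to be consistent, the last one would be expected to return the same three rows as the first two.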