Overview of [] (getitem) API · Issue #9595 · pandas-dev/pandas
some examples (on Series only) in #12890
I started making an overview of the indexing semantics with http://nbviewer.ipython.org/gist/jorisvandenbossche/7889b389a21b41bc1063 (only for series/frame, not for panel)
Conclusion: it is a mess :-)
Summary for slicing
- Slicing with integer labels is:
  - always integer location based
  - except for a float indexer where it is label based
- Slicing with other types of labels is always label based if it is of appropriate type for the indexer.
So, you can say that the behaviour is equivalent to `.ix`, except that the behaviour for integer labels is different for integer indexers (swapped). (For `.ix`, when having an integer axis, it is always label based, with no fallback to integer location based.)
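To make the slicing summary concrete, here is a minimal sketch with made-up data (the series names and values are only for illustration). It uses `.iloc` and `.loc` to show the two possible readings of an integer slice next to what `[]` returns. The float-index case is left out on purpose, since it may not behave the same in every pandas version.

```python
import pandas as pd

# Hypothetical example data: an integer index whose labels differ from positions.
s_int = pd.Series([10, 20, 30, 40], index=[2, 3, 4, 5])

# [] with an integer slice on an integer index selects by position (end exclusive).
by_getitem = s_int[1:3]

# The explicit accessors make the two interpretations visible.
by_position = s_int.iloc[1:3]  # positions 1 and 2
by_label = s_int.loc[2:3]      # labels 2 through 3, end inclusive

# With a string index, a slice of string labels is label based and end inclusive.
s_str = pd.Series([1, 2, 3], index=["a", "b", "c"])
str_slice = s_str["a":"b"]

print(by_getitem.tolist(), by_label.tolist(), str_slice.tolist())
```

Here `s_int[1:3]` matches the `.iloc` result rather than the `.loc` one, which is the "swapped" behaviour described above.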
Summary for single label
- Indexing with a single label is always label based
- But, there is fallback to integer location based, except for integer and float indexers
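A small sketch of the integer-index case, again with made-up data: a single integer key is treated as a label, and a key that is a valid position but not a label is not found. The positional fallback for other index types is not exercised here, since I would not rely on it being the same across pandas versions.

```python
import pandas as pd

# Hypothetical example data: labels 2..5, so label 2 sits at position 0.
s_int = pd.Series([10, 20, 30, 40], index=[2, 3, 4, 5])

# A single integer key on an integer index is looked up as a label.
value = s_int[2]  # label 2, not position 2

# 0 is a valid position but not a label, so there is no positional fallback.
try:
    s_int[0]
    fell_back = True
except KeyError:
    fell_back = False

print(value, fell_back)
```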
Summary for indexing with list of labels
- It is primarily label based, but:
  - There is fallback to integer location based, except for an integer or float axis
  - It is a pure reindex: even if none of the labels in the list is found, you just get an all-NaN series (which contrasts with loc, where at least one label must be found)
  - String parsing for a datetime index does not seem to work
This mainly follows `ix`, apart from points 2 and 3.
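As an illustration with made-up data: a list of labels that are all present is looked up by label and returned in the requested order. To show what "pure reindex" means, the sketch calls `Series.reindex` explicitly instead of passing missing labels to `[]`, because how `[]` treats missing labels depends on the pandas version.

```python
import pandas as pd

# Hypothetical example data with an integer index.
s_int = pd.Series([10, 20, 30, 40], index=[2, 3, 4, 5])

# A list of present labels is label based and keeps the requested order.
picked = s_int[[5, 2]]

# "Pure reindex" semantics, spelled out: missing labels become NaN.
reindexed = s_int.reindex([2, 99])

print(picked.tolist(), reindexed.isna().tolist())
```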
Summary for boolean indexing
- This is simple, it just works as expected
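For completeness, a minimal boolean-mask sketch on made-up data: the mask is aligned with the series and keeps the rows where it is True.

```python
import pandas as pd

# Hypothetical example data.
s = pd.Series([10, 20, 30, 40], index=[2, 3, 4, 5])

# A boolean Series of the same length selects the True positions.
mask = s > 20
selected = s[mask]

print(selected.tolist(), selected.index.tolist())
```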
Summary for DataFrames
- It uses the 'information' axis (axis 1) for:
  - single labels
  - list of labels
- It uses the rows (axis 0) for:
  - slicing
  - boolean indexing
This is as documented (only the boolean case is not explicitly documented, I think).
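The axis choice can be sketched on a small made-up frame (column and row names are arbitrary): labels and lists of labels pick columns, while slices and boolean masks pick rows.

```python
import pandas as pd

# Hypothetical example frame: columns "a"/"b", rows "x"/"y"/"z".
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])

col = df["a"]                # single label: a column, returned as a Series
cols = df[["a", "b"]]        # list of labels: columns, returned as a DataFrame
rows_slice = df[0:2]         # slice: rows (here the first two, by position)
rows_bool = df[df["a"] > 1]  # boolean mask: rows where the mask is True

print(col.tolist(), list(cols.columns), list(rows_slice.index), list(rows_bool.index))
```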
For the rest (on the chosen axis), it follows the same semantics as `[]` on a series, but:
- for a list of labels, now all labels must be present (no pure reindex as with series)
- for single labels: no fallback to integer location based for a non-numeric index (but it does fall back for a list of labels ...)
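The "all labels must be present" point can be checked with a short sketch on a made-up frame, where `"missing"` is deliberately not a column. The fallback cases are not shown, since I am not sure they are stable across pandas versions.

```python
import pandas as pd

# Hypothetical example frame with columns "a" and "b" only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# A list containing an absent column label is rejected rather than reindexed.
try:
    df[["a", "missing"]]
    raised = False
except KeyError:
    raised = True

print(raised)
```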
Open questions:
- Are there things we can change (that would not be too disruptive, or maybe not)? And do we want to change them?
- How do we document this best?
  - Now you have the "basics" section (http://pandas.pydata.org/pandas-docs/stable/indexing.html#basics) and the slicing section (http://pandas.pydata.org/pandas-docs/stable/indexing.html#slicing-ranges), but these do not cover all cases.