API: allow the iloc indexer to run off the end and not raise IndexError (GH6296) by jreback · Pull Request #6299 · pandas-dev/pandas

I've hit an issue with code from this ticket when debugging #6370 and I might have an objection.

I agree that slices should behave as they do outside pandas: a slice that extends beyond the container's bounds should be silently clamped, something along the lines of (UPD: fixed the code a bit)

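    # s is the slice, obj the container; assumes step > 0 and start/stop are not None
    start, stop, step = s.start, s.stop, s.step
    length = len(obj)

    # a negative bound counts from the end, so it is added to the length
    if start < 0:
        start = max(length + start, 0)
    elif start > length:
        start = length

    if stop < 0:
        stop = max(length + stop, 0)
    elif stop > length:
        stop = length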

(there's actually a slice.indices method, called as s.indices(len(obj)), which does exactly that, but that's not the point).
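For reference, a minimal sketch of how the built-in method compares with hand-written clamping. The helper name clamp_bounds is hypothetical and only exists for this illustration; it handles positive steps only and treats a missing start/stop as 0 and len(obj) respectively.

```python
def clamp_bounds(s, length):
    """Hypothetical helper: hand-written bounding for a positive-step slice."""
    start = 0 if s.start is None else s.start
    stop = length if s.stop is None else s.stop

    # Negative bounds count from the end; anything past either edge is clamped.
    if start < 0:
        start = max(length + start, 0)
    elif start > length:
        start = length

    if stop < 0:
        stop = max(length + stop, 0)
    elif stop > length:
        stop = length

    return start, stop


obj = list(range(10))

for s in (slice(2, 1000), slice(-1000, 5), slice(-3, None), slice(20, 30)):
    # slice.indices() returns the normalized (start, stop, step) triple.
    start, stop, step = s.indices(len(obj))
    assert (start, stop) == clamp_bounds(s, len(obj))
    # Slicing with the normalized bounds selects the same elements.
    assert obj[s] == obj[start:stop:step]
```

In practice s.indices(len(obj)) is the safer choice, since it also covers negative steps, which the hand-written version above does not attempt.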

The point is that silently dropping out-of-bounds integer indexers, as in df[[1000, 5000, 10000]], might be counter-intuitive to people who come from the numpy world (it is for me, at least), just as it was counter-intuitive for people with a non-pandas Python background to find out that slicing raised IndexError on out-of-bounds start/stop values.
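To make the distinction concrete, here is a small sketch using a plain Python list rather than pandas, so it says nothing about what any particular pandas version does. The helper take is hypothetical and stands in for a strict positional lookup; numpy treats the two cases the same way (slices are clamped, out-of-bounds integer indexers raise IndexError).

```python
data = list(range(10))

# Out-of-bounds slices are silently clamped to the container's bounds.
clamped = data[5:1000]    # the last five elements
empty = data[1000:5000]   # nothing to select, so an empty list


def take(seq, positions):
    """Hypothetical helper: strict positional lookup, nothing is dropped."""
    return [seq[i] for i in positions]


# Out-of-bounds integer positions raise instead of being skipped.
try:
    take(data, [1000, 5000, 10000])
    raised = False
except IndexError:
    raised = True
```

The asymmetry is the one I would expect to carry over: a slice describes a range and can reasonably be narrowed, while an explicit integer position names a specific element that either exists or does not.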

I've read that this ticket helped with #6301. Is there a way to keep only the slice bounding and drop the integer index bounding without causing a regression there?