DOC: slicing with labels when index has duplicate labels · Issue #36251 · pandas-dev/pandas

Location of the documentation

https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#slicing-ranges


Documentation problem

The documentation doesn't specify what happens when you slice an index that contains duplicate labels. Does the slice run from the first occurrence of the start label to the last occurrence of the end label?
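To make the question concrete, here is a minimal sketch of what I believe current pandas does; the series and variable names are made up for illustration, and the behaviour should be confirmed against the version being documented. With a sorted index the slice appears to run from the first occurrence of the start label through the last occurrence of the end label, while duplicates that are not adjacent seem to make the slice bound ambiguous and raise a KeyError.

```python
import pandas as pd

# Sorted index with duplicate labels (hypothetical data).
s_sorted = pd.Series(range(5), index=list("aabcc"))

# Runs from the first "a" through the last "b".
sorted_slice = s_sorted.loc["a":"b"]

# Here the two "a" labels are not adjacent, so there is no single
# position that can serve as the start of the slice.
s_scattered = pd.Series(range(4), index=list("abac"))
try:
    s_scattered.loc["a":"c"]
    scattered_error = None
except KeyError as exc:
    scattered_error = exc

print(sorted_slice)
print(repr(scattered_error))
```

If this matches the intended behaviour, both cases seem worth stating explicitly in the slicing section.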

Suggested fix for documentation

Clarify how slicing interacts with non-unique index elements.

I would also emphasize, at
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#slicing-with-labels
that the values of the index labels, their natural comparison order, and the elements that fall between them in that order do not matter for slicing. For example, a slice of 'Z':'A' is not necessarily empty; the result depends on the positions that the labels denote in the index. In all current documentation examples, the label denoting the start of the slice also happens to be lexicographically smaller than the label denoting the end of the slice, which can mislead readers into thinking the natural order of the labels has some meaning, when all they do is denote integer positions.
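An example along these lines might make the point in the docs. This is only a sketch with a made-up series whose index is unique but unsorted, and where both slice labels are present in the index; as far as I can tell, when a label is missing pandas falls back to label ordering on sorted indexes and raises KeyError on unsorted ones, so that case would need separate wording.

```python
import pandas as pd

# Unique, deliberately unsorted index (hypothetical data).
s = pd.Series([10, 20, 30, 40], index=["Z", "B", "M", "A"])

# "Z" sorts after "A", yet the slice covers everything from the
# position of "Z" through the position of "A".
z_to_a = s.loc["Z":"A"]

# The reverse slice is the empty one here, because "A" is
# positioned after "Z" in the index.
a_to_z = s.loc["A":"Z"]

print(z_to_a)
print(a_to_z)
```

Using labels whose lexicographic order disagrees with their position, as above, avoids suggesting that the comparison order of the labels drives the result.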