ENH: Add sort parameter to set operations for some Indexes and adjust… by reidy-p · Pull Request #24521 · pandas-dev/pandas
… tests
- Progress towards ENH: Add sort parameter to other set operations if possible #24471
- tests added / passed
- passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
- whatsnew entry
This PR makes some progress towards adding a `sort` parameter, with a default of `True`, to the set operations (`union`, `intersection`, `difference` and `symmetric_difference`) on `Index` classes. Adding this parameter to every type of `Index` would result in a very big PR, so I have decided to break it up and add it in stages. I have tried to focus on `Index`, `MultiIndex`, `PeriodIndex` and `IntervalIndex` in this PR, but have made some very small changes to set operations on other indices where necessary.
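
To make the intended semantics concrete, here is a minimal pure-Python sketch of what a `sort` flag means for each of the four set operations. It works on plain lists rather than `Index` objects, the helper names are hypothetical, and it is not how pandas implements these methods; it only assumes that `sort=False` should keep first-seen order while `sort=True` sorts the result.

```python
# Pure-Python model of the proposed `sort` flag. The helper names are
# hypothetical and this is NOT the pandas implementation.

def _dedupe(values):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(values))


def union(left, right, sort=True):
    result = _dedupe(list(left) + list(right))
    return sorted(result) if sort else result


def intersection(left, right, sort=True):
    other = set(right)
    result = [v for v in _dedupe(left) if v in other]
    return sorted(result) if sort else result


def difference(left, right, sort=True):
    other = set(right)
    result = [v for v in _dedupe(left) if v not in other]
    return sorted(result) if sort else result


def symmetric_difference(left, right, sort=True):
    # Elements unique to `left`, followed by elements unique to `right`.
    result = (difference(left, right, sort=False)
              + difference(right, left, sort=False))
    return sorted(result) if sort else result


left, right = [3, 1, 2], [5, 1, 4]
print(union(left, right))                            # [1, 2, 3, 4, 5]
print(union(left, right, sort=False))                # [3, 1, 2, 5, 4]
print(symmetric_difference(left, right, sort=False)) # [3, 2, 5, 4]
```

With `sort=False` the result order follows the inputs, which is the behaviour the new parameter is meant to expose where the underlying implementation allows it.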
Some issues to consider:
- I'm not sure whether it will be possible to control the sorting behaviour of the results of all of the set operations for all of the `Index` types. For example, because of the way some of the set operations are implemented on some `Index` types, the results may always be sorted even if `sort=False` is passed (e.g., a `union` operation on `DatetimeIndex` objects may return a sorted result in some cases even if `sort=False` is passed). Perhaps this is not really a problem as long as we document it.
- There are some other corner cases that always ignore the
`sort` parameter at the moment. For example, the `intersection` of an unsorted `Index` with itself will return the original unsorted `Index` even if `sort=True` is passed, because the `sort` parameter is simply ignored in this case. Similarly, the `union` of an unsorted `Index` with an empty `Index` will also return the original unsorted `Index` even if `sort=True`. There is similar behaviour for the other set operations.
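
The corner cases in the last point come down to early returns that run before the `sort` flag is ever consulted. The sketch below models that with plain lists. The function names are hypothetical, and the shortcut conditions (identical inputs, or an empty second operand in the union case) are assumptions chosen to mirror the behaviour described above, not a copy of the pandas code.

```python
# Hypothetical model of the short-circuit paths. Names and conditions are
# illustrative assumptions; this is NOT the pandas implementation.

def union_with_shortcuts(left, right, sort=True):
    left, right = list(left), list(right)
    # Early return: nothing new to merge, so `left` comes back untouched
    # and the `sort` flag is never looked at.
    if not right or left == right:
        return left
    result = list(dict.fromkeys(left + right))
    return sorted(result) if sort else result


def intersection_with_shortcuts(left, right, sort=True):
    left, right = list(left), list(right)
    # Early return for identical inputs, again before `sort` is checked.
    if left == right:
        return left
    other = set(right)
    result = [v for v in dict.fromkeys(left) if v in other]
    return sorted(result) if sort else result


unsorted = [3, 1, 2]
print(intersection_with_shortcuts(unsorted, unsorted, sort=True))  # [3, 1, 2]
print(union_with_shortcuts(unsorted, [], sort=True))               # [3, 1, 2]
print(union_with_shortcuts(unsorted, [0], sort=True))              # [0, 1, 2, 3]
```

In this model, the result is sorted only when an input takes the general path, so `sort=True` does not guarantee a sorted result unless the shortcuts are changed or the exceptions are documented.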