PERF: speed up IntervalIndex._intersection_non_unique by ~50x by qwhelan · Pull Request #27489 · pandas-dev/pandas
I've been backfilling asv data and noticed the following regression in IntervalIndexMethod.time_intersection_both_duplicate (see here):
This regression was missed because the benchmark was added in #26711, after the regression had already been introduced in #26225.
This PR both simplifies the IntervalIndex._intersection_non_unique logic (now equivalent to MultiIndex._intersection_non_unique) and provides a ~50x speedup:
       before           after         ratio
     [9bab81e0]       [2848036e]
     <interval_non_unique_intersection~1>  <interval_non_unique_intersection>
-      12.6±0.1ms         725±30μs     0.06  index_object.IntervalIndexMethod.time_intersection_both_duplicate(1000)
-         4.96±0s        96.7±6ms     0.02  index_object.IntervalIndexMethod.time_intersection_both_duplicate(100000)
The new numbers are about 10x faster than the old baseline.
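For readers who want a feel for this kind of logic, here is a minimal sketch of a set-based, mask-building intersection over intervals that may contain duplicates. It is an illustration under stated assumptions, not the pandas implementation: the function name `intersection_mask` is hypothetical, intervals are modelled as plain `(left, right)` tuples, and NaN handling and the `closed` side are ignored.

```python
def intersection_mask(left, right, other_left, other_right):
    """Return one bool per (left, right) pair: True if the pair occurs in other.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # Hash the other side's intervals once so each lookup is O(1) on average.
    other_pairs = set(zip(other_left, other_right))
    # A single pass over our own intervals; duplicates are tested independently.
    return [pair in other_pairs for pair in zip(left, right)]


# Intervals (0, 1), (1, 2), (2, 3), (1, 2): note the duplicated (1, 2).
left = [0, 1, 2, 1]
right = [1, 2, 3, 2]

# Intervals (1, 2), (2, 3), (3, 4), (1, 2).
other_left = [1, 2, 3, 1]
other_right = [2, 3, 4, 2]

mask = intersection_mask(left, right, other_left, other_right)
kept = [(lo, hi) for lo, hi, keep in zip(left, right, mask) if keep]
print(kept)  # [(1, 2), (2, 3), (1, 2)]
```

In this sketch the cost is one pass to build the set plus one pass to test membership, so it grows roughly linearly with the number of intervals, and duplicated entries on the left-hand side are kept as they are.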
- closes #xxxx
- tests added / passed
- passes black pandas
- passes git diff upstream/master -u -- "*.py" | flake8 --diff
- whatsnew entry