BUG/PERF: MultiIndex setops with sort=None by lukemanley · Pull Request #49010 · pandas-dev/pandas
Simplify algos.safe_sort and improve its performance, which in turn improves the performance of MultiIndex set operations when sort=None.
       before           after         ratio
     [55dc3243]       [f00149ce]
     <main>           <safe-sort-multiindex>
-      73.3±2ms       20.2±0.4ms     0.28  multiindex_object.SetOperations.time_operation('monotonic', 'ea_int', 'intersection', None)
-      71.0±2ms       19.0±0.6ms     0.27  multiindex_object.SetOperations.time_operation('monotonic', 'int', 'intersection', None)
-     104±0.7ms         24.0±3ms     0.23  multiindex_object.SetOperations.time_operation('non_monotonic', 'string', 'intersection', None)
-      98.7±2ms       21.4±0.3ms     0.22  multiindex_object.SetOperations.time_operation('monotonic', 'string', 'intersection', None)
-     114±0.5ms       24.7±0.4ms     0.22  multiindex_object.SetOperations.time_operation('non_monotonic', 'ea_int', 'intersection', None)
-     109±0.7ms         21.3±2ms     0.20  multiindex_object.SetOperations.time_operation('non_monotonic', 'int', 'intersection', None)
-      99.3±1ms         18.5±2ms     0.19  multiindex_object.SetOperations.time_operation('monotonic', 'datetime', 'intersection', None)
-       157±6ms         21.4±2ms     0.14  multiindex_object.SetOperations.time_operation('non_monotonic', 'datetime', 'intersection', None)
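For readers who want to try the affected code path locally, here is a rough sketch of the kind of call these benchmarks time: MultiIndex.intersection with sort=None on a non-monotonic index. The level sizes and the way the non-monotonic index is built are assumptions chosen to keep the example quick. They are not the actual asv benchmark setup, so the timings will not match the table above.

```python
import timeit

import pandas as pd

# Hypothetical level sizes; the real asv benchmark uses its own setup.
level1 = range(200)
level2 = range(50)

# A monotonic two-level MultiIndex and a reversed (non-monotonic) copy.
monotonic = pd.MultiIndex.from_product([level1, level2])
non_monotonic = monotonic[::-1]

# Intersect with an index that is missing the last entry.
other = monotonic[:-1]

# sort=None is the default; it is passed explicitly to make the code path clear.
result = non_monotonic.intersection(other, sort=None)

elapsed = timeit.timeit(
    lambda: non_monotonic.intersection(other, sort=None), number=5
)
print(f"{len(result)} entries, {elapsed / 5 * 1000:.2f} ms per call")
```

Running this on main and on the PR branch should give a rough feel for the difference, though asv remains the right tool for a proper comparison.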
I updated the expected value in test_union_nan_got_duplicated, which was added in #38977. MultiIndex.union sorts by default, but the expected value in that test was not sorted.
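The default sorting behavior of MultiIndex.union can be illustrated with a small sketch. The index values below are made up for illustration and are not taken from test_union_nan_got_duplicated.

```python
import pandas as pd

# Illustrative values only; neither input is sorted.
left = pd.MultiIndex.from_tuples([(2, "b"), (1, "a")])
right = pd.MultiIndex.from_tuples([(3, "c"), (1, "a")])

# The default sort=None sorts the result when the values are comparable.
sorted_union = left.union(right)

# sort=False skips the sorting step.
unsorted_union = left.union(right, sort=False)

print(list(sorted_union))
print(list(unsorted_union))
```

A test that compares against the default union therefore needs a sorted expected value, or it has to pass sort=False explicitly.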