BUG: MultiIndex.union dropping duplicates from result by phofl · Pull Request #38977 · pandas-dev/pandas
Unfortunately, this looks like a pretty high performance penalty: the asv benchmarks show `union` running up to about 2x slower.
       before           after         ratio
     [84d9c5ed]       [f77120cc]
     <multiindex_union~1^2>       <multiindex_union>
+      77.4±2ms        159±4ms     2.05  multiindex_object.SetOperations.time_operation('non_monotonic', 'int', 'union')
+    33.4±0.5ms       57.9±1ms     1.73  multiindex_object.GetLoc.time_large_get_loc_warm
+      102±1ms        152±2ms     1.49  multiindex_object.SetOperations.time_operation('non_monotonic', 'string', 'union')
+    99.4±0.9ms        144±1ms     1.45  multiindex_object.SetOperations.time_operation('monotonic', 'string', 'union')
+      520±4ms       640±20ms     1.23  multiindex_object.SetOperations.time_operation('non_monotonic', 'datetime', 'union')
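For anyone who wants a quick local check without running the full asv suite, here is a rough sketch that times `MultiIndex.union` on a non-monotonic integer index. It is only loosely modeled on the `SetOperations` benchmark named above; the level sizes and the way `left` and `right` are built are my own assumptions, not the benchmark's actual setup, so absolute numbers will not match the table.

```python
import timeit

import pandas as pd

# Assumed sizes, chosen only to keep the run short (not the asv setup).
N_OUTER, N_INNER = 1_000, 10

base = pd.MultiIndex.from_product(
    [range(N_OUTER), range(N_INNER)], names=["a", "b"]
)

# Reversing a sorted index gives a non-monotonic (decreasing) one.
left = base[::-1]
# The other operand is the sorted index minus its last entry.
right = base[:-1]

# Take the best of several single runs to reduce timing noise.
best = min(timeit.repeat(lambda: left.union(right), number=1, repeat=5))

result = left.union(right)
print(f"union of {len(left)} and {len(right)} entries: {best * 1e3:.2f} ms")
print(f"result has {len(result)} entries")
```

Running this on both commits (the branch and its merge base) gives a crude before/after comparison; the asv numbers remain the authoritative ones.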