PERF: MultiIndex set and indexing operations by lukemanley · Pull Request #53955 · pandas-dev/pandas

Avoids "densifying" the levels of a MultiIndex for various set and indexing operations (`MultiIndex.get_indexer_for`).
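
For orientation, here is a minimal sketch of the public operations that the benchmarks below exercise (set operations, `isin`, and `get_indexer_for`). The index contents and variable names are illustrative assumptions and are not taken from the asv suite. The snippet shows only user-facing behavior, not the internal change made in this PR.

```python
import pandas as pd

# Two small, overlapping MultiIndexes (illustrative data, not the benchmark's).
left = pd.MultiIndex.from_product([range(4), ["a", "b"]], names=["num", "let"])
right = pd.MultiIndex.from_product([range(2, 6), ["a", "b"]], names=["num", "let"])

# Set operations, as timed by the SetOperations benchmarks.
union = left.union(right, sort=False)
inter = left.intersection(right, sort=False)
sym_diff = left.symmetric_difference(right, sort=None)

# Membership test, as timed by the Isin benchmarks.
mask = left.isin(right)

# Position in `left` of each entry of `right`; -1 marks entries not found.
positions = left.get_indexer_for(right)

print(len(union), len(inter), len(sym_diff))
print(mask.tolist())
print(positions.tolist())
```

Results should be identical before and after the change; only the timings are expected to differ.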

> asv continuous -f 1.1 upstream/main mi-set-ops -b ^multiindex_object

       before           after         ratio
     [6eb59b32]       [30e990fd]
     <main>           <mi-set-ops>
-      20.1±0.5ms         17.5±1ms     0.87  multiindex_object.SetOperations.time_operation('monotonic', 'datetime', 'symmetric_difference', False)
-        57.3±4ms       49.3±0.9ms     0.86  multiindex_object.SetOperations.time_operation('non_monotonic', 'string', 'intersection', None)
-        20.5±1ms       17.6±0.6ms     0.86  multiindex_object.SetOperations.time_operation('non_monotonic', 'datetime', 'symmetric_difference', None)
-        19.1±2ms       16.4±0.4ms     0.86  multiindex_object.SetOperations.time_operation('non_monotonic', 'ea_int', 'symmetric_difference', False)
-        19.3±2ms      16.5±0.05ms     0.85  multiindex_object.SetOperations.time_operation('non_monotonic', 'datetime', 'symmetric_difference', False)
-        20.4±1ms       17.1±0.5ms     0.84  multiindex_object.SetOperations.time_operation('non_monotonic', 'int', 'symmetric_difference', None)
-        21.9±1ms       18.4±0.5ms     0.84  multiindex_object.SetOperations.time_operation('monotonic', 'datetime', 'intersection', False)
-      48.0±0.4ms       39.8±0.5ms     0.83  multiindex_object.SetOperations.time_operation('non_monotonic', 'string', 'union', None)
-        21.0±1ms       17.4±0.4ms     0.83  multiindex_object.SetOperations.time_operation('monotonic', 'ea_int', 'symmetric_difference', None)
-        50.8±3ms         41.8±1ms     0.82  multiindex_object.SetOperations.time_operation('monotonic', 'string', 'union', None)
-        21.6±1ms       17.2±0.5ms     0.80  multiindex_object.SetOperations.time_operation('non_monotonic', 'datetime', 'intersection', False)
-        21.9±1ms       17.2±0.2ms     0.79  multiindex_object.SetOperations.time_operation('monotonic', 'ea_int', 'intersection', False)
-      22.0±0.7ms       17.2±0.2ms     0.78  multiindex_object.SetOperations.time_operation('monotonic', 'datetime', 'symmetric_difference', None)
-        28.3±3ms       22.1±0.3ms     0.78  multiindex_object.SetOperations.time_operation('monotonic', 'datetime', 'union', None)
-      14.5±0.9ms       11.1±0.2ms     0.77  multiindex_object.SetOperations.time_operation('non_monotonic', 'int', 'union', False)
-        21.9±2ms       16.5±0.9ms     0.75  multiindex_object.SetOperations.time_operation('monotonic', 'ea_int', 'symmetric_difference', False)
-        15.7±2ms       11.0±0.4ms     0.70  multiindex_object.SetOperations.time_operation('non_monotonic', 'ea_int', 'union', False)
-        15.9±1ms       11.1±0.1ms     0.69  multiindex_object.SetOperations.time_operation('non_monotonic', 'datetime', 'union', False)
-        17.4±1ms       10.9±0.2ms     0.62  multiindex_object.SetOperations.time_operation('monotonic', 'int', 'union', False)
-        30.1±3ms       17.5±0.4ms     0.58  multiindex_object.SetOperations.time_operation('non_monotonic', 'ea_int', 'symmetric_difference', None)
-      31.2±0.4ms       17.8±0.5ms     0.57  multiindex_object.SetOperations.time_operation('non_monotonic', 'int', 'intersection', False)
-        30.4±1ms         17.3±1ms     0.57  multiindex_object.SetOperations.time_operation('monotonic', 'int', 'symmetric_difference', None)
-      6.99±0.3ms       3.96±0.1ms     0.57  multiindex_object.Isin.time_isin_small('int')
-        11.6±1ms       6.48±0.3ms     0.56  multiindex_object.Isin.time_isin_large('int')
-        32.3±3ms       17.6±0.8ms     0.55  multiindex_object.SetOperations.time_operation('non_monotonic', 'string', 'symmetric_difference', None)
-      7.81±0.6ms       4.24±0.8ms     0.54  multiindex_object.Isin.time_isin_small('datetime')
-        22.5±3ms       12.0±0.7ms     0.53  multiindex_object.SetOperations.time_operation('monotonic', 'datetime', 'union', False)
-        31.0±3ms       15.6±0.2ms     0.50  multiindex_object.SetOperations.time_operation('non_monotonic', 'string', 'symmetric_difference', False)
-        34.0±2ms       17.0±0.7ms     0.50  multiindex_object.SetOperations.time_operation('monotonic', 'string', 'intersection', False)
-        31.5±2ms       15.6±0.2ms     0.50  multiindex_object.SetOperations.time_operation('monotonic', 'string', 'symmetric_difference', False)
-        34.0±2ms      16.4±0.09ms     0.48  multiindex_object.SetOperations.time_operation('monotonic', 'string', 'symmetric_difference', None)
-      11.9±0.6ms       5.57±0.2ms     0.47  multiindex_object.Isin.time_isin_large('datetime')
-        24.0±2ms       11.0±0.4ms     0.46  multiindex_object.SetOperations.time_operation('monotonic', 'string', 'union', False)
-      36.1±0.8ms       15.8±0.7ms     0.44  multiindex_object.SetOperations.time_operation('non_monotonic', 'string', 'intersection', False)
-      24.4±0.3ms       10.6±0.3ms     0.44  multiindex_object.SetOperations.time_operation('non_monotonic', 'string', 'union', False)
-      17.1±0.8ms       5.45±0.7ms     0.32  multiindex_object.Isin.time_isin_large('string')
-      12.8±0.8ms      3.59±0.03ms     0.28  multiindex_object.Isin.time_isin_small('string')