ENH: Keep dtypes in MultiIndex.union without NAs by phofl · Pull Request #48505 · pandas-dev/pandas
fast_unique_multiple was never used with more than two arrays, so there is no need to keep the general implementation around. Returning indices from the Cython level allows us to operate on the initial object and hence keep the dtypes.
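To illustrate the idea of returning indices rather than values, here is a minimal pure-Python sketch. It is not the pandas Cython routine: the helper name indices_of_new_values is hypothetical, and skipping repeated values in right is a simplifying assumption of the sketch, not a statement about how pandas treats duplicates.

```python
from typing import Hashable, List, Sequence


def indices_of_new_values(
    left: Sequence[Hashable], right: Sequence[Hashable]
) -> List[int]:
    """Return positions in `right` whose values do not occur in `left`.

    Hypothetical helper for illustration only. Values repeated within
    `right` are reported once, at their first position (an assumption).
    """
    seen = set(left)
    idx: List[int] = []
    for i, val in enumerate(right):
        if val not in seen:
            seen.add(val)
            idx.append(i)
    return idx


left = [(1, "a"), (2, "b")]
right = [(2, "b"), (3, "c"), (3, "c"), (4, "d")]

# Only positions are computed here; the caller selects from the original
# `right` object, so whatever typing that object carries is left untouched.
idx = indices_of_new_values(left, right)
union = list(left) + [right[i] for i in idx]
```

With a MultiIndex, the analogous last step would be something along the lines of left.append(right.take(idx)), which works on the original objects instead of rebuilding the result from an object-dtype array of tuples.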
       before           after         ratio
     [fe9e5d02]       [e690752b]
     <midx_union_no_na~4>       <midx_union_no_na>
-      59.6±0.3ms       53.0±0.8ms     0.89  multiindex_object.SetOperations.time_operation('non_monotonic', 'string', 'union')
-      44.2±0.2ms       25.7±0.2ms     0.58  multiindex_object.SetOperations.time_operation('non_monotonic', 'int', 'union')
-        44.5±1ms       25.7±0.5ms     0.58  multiindex_object.SetOperations.time_operation('monotonic', 'int', 'union')
-        115±10ms       49.1±0.5ms     0.43  multiindex_object.SetOperations.time_operation('monotonic', 'datetime', 'union')
-       114±0.3ms       48.0±0.7ms     0.42  multiindex_object.SetOperations.time_operation('non_monotonic', 'datetime', 'union')
As a follow-up, we could improve the Cython implementation to handle duplicates in right too.