PERF: significantly improve performance of MultiIndex.shape by qwhelan · Pull Request #27384 · pandas-dev/pandas

MultiIndex.shape is currently extremely slow because it triggers the creation of ._values, which can be quite expensive for datetime levels. The one mitigating factor is that the result is cached, which makes ._values.shape near-instant on subsequent calls but also makes the slowdown hard to catch in asv benchmarks. This commit adds a suite dedicated to measuring such cached properties on Index objects.
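To illustrate the problem in isolation, here is a minimal pure-Python sketch. `ToyMultiIndex`, `slow_shape`, and the `materializations` counter are hypothetical names invented for this example and are not pandas code; the sketch also assumes the fast path can derive the shape from the length alone, which may differ from the actual change in this PR.

```python
from functools import cached_property


class ToyMultiIndex:
    """Hypothetical stand-in for a MultiIndex; not pandas code."""

    def __init__(self, codes):
        # One list of integer codes per level, all of equal length.
        self.codes = codes
        # Counts how often the expensive materialization actually runs.
        self.materializations = 0

    def __len__(self):
        # Cheap: only looks at the length of the first level's codes.
        return len(self.codes[0])

    @cached_property
    def _values(self):
        # Expensive step: builds one tuple per row. Cached after first use.
        self.materializations += 1
        return list(zip(*self.codes))

    @property
    def slow_shape(self):
        # Slow path: forces _values to be built just to read its length.
        return (len(self._values),)

    @property
    def shape(self):
        # Fast path (assumed): the length is known without building _values.
        return (len(self),)


idx = ToyMultiIndex([[0, 0, 1], [0, 1, 0]])
print(idx.shape, idx.materializations)       # fast path, nothing materialized
print(idx.slow_shape, idx.materializations)  # slow path materializes once
print(idx.slow_shape, idx.materializations)  # second call hits the cache
```

The last two calls show why a naive benchmark can miss the cost: only the first access to the slow path pays for materialization, so repeated timing on the same object mostly measures the cached case.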

asv results show a ~400,000x speedup for a relatively straightforward case:

       before           after         ratio
     [269d3681]       [d205acf6]
     <master>         <shape>
-      3.52±0.02s       8.33±0.2μs     0.00  index_cached_properties.MultiIndexCached.time_shape
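As a quick sanity check on the figures above, the following sketch recomputes the speedup from the two reported timings. It assumes the ratio column is after/before rounded to two decimals, which would explain why it displays as 0.00; the variable names are illustrative only.

```python
# Timings copied from the asv output above.
before_s = 3.52      # seconds, <master>
after_s = 8.33e-6    # 8.33 microseconds, <shape>

speedup = before_s / after_s  # how many times faster the new code is
ratio = after_s / before_s    # assumed meaning of the asv "ratio" column

print(f"speedup: {speedup:,.0f}x")
print(f"ratio:   {ratio:.2f}")
```

The speedup comes out at a little over 420,000x, consistent with the "~400,000x" quoted in the description.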