PERF: significantly improve performance of MultiIndex.shape by qwhelan · Pull Request #27384 · pandas-dev/pandas
`MultiIndex.shape` is currently extremely slow because it triggers the creation of `._values`, which can be quite expensive for datetime levels. The one mitigating factor is that this result is cached, which makes `._values.shape` near-instant on subsequent calls. The same caching also makes the cost hard to catch in `asv` benchmarks, so this commit adds a suite dedicated to measuring such cached properties on `Index` objects.
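The effect described above can be illustrated with a toy model. This is only a sketch and not pandas code: `ToyMultiIndex`, `slow_shape`, `fast_shape`, and the `materialized` counter are hypothetical names, and it assumes that the shape of a one-dimensional index can be derived from its length alone. It shows that a shape routed through a cached `_values` pays the materialization cost once, on first access, while a length-based shape never pays it.

```python
from functools import cached_property


class ToyMultiIndex:
    """Hypothetical stand-in for a MultiIndex; not the pandas implementation."""

    def __init__(self, codes):
        # One code list per level, all of equal length.
        self.codes = codes
        # Counts how many times the expensive materialization runs.
        self.materialized = 0

    def __len__(self):
        return len(self.codes[0])

    @cached_property
    def _values(self):
        # Stand-in for the expensive step: build one tuple per row.
        self.materialized += 1
        return list(zip(*self.codes))

    @property
    def slow_shape(self):
        # Goes through _values, so the first call pays for materialization.
        return (len(self._values),)

    @property
    def fast_shape(self):
        # Needs only the length, so _values is never built.
        return (len(self),)


idx = ToyMultiIndex([[0, 0, 1, 1], [0, 1, 0, 1]])
print(idx.fast_shape, idx.materialized)  # (4,) 0
print(idx.slow_shape, idx.materialized)  # (4,) 1
print(idx.slow_shape, idx.materialized)  # (4,) 1  (cached on the second call)
```

Because the second `slow_shape` call hits the cache, a benchmark that reuses one object across iterations mostly measures the cached path. Measuring the cold path requires building a fresh object before each timed access.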
`asv` results show a ~400,000x speedup for a relatively straightforward case:
before after ratio
[269d3681] [d205acf6]
<master> <shape>
- 3.52±0.02s 8.33±0.2μs 0.00 index_cached_properties.MultiIndexCached.time_shape
- closes #xxxx
- tests added / passed
- passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
- whatsnew entry