BENCH: programmatically create benchmarks for large ngroups (GH6787) by dlovell · Pull Request #8410 · pandas-dev/pandas

Uses ngroups=10000, as suggested in the issue; the run takes about 1 hour on my desktop.

For results (vb_suite.log, pkl file) see: https://gist.github.com/dlovell/ea3400273314e7612f6e
Note: the gist references a different commit hash. I have since changed the commit message and added a modification to doc/source/v0.15.0.txt, but the actual modifications to vb_suite/groupby.py are essentially the same.
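For readers unfamiliar with the approach, here is a rough sketch of what "programmatically creating" such benchmarks can look like: one setup string per ngroups value and one statement string per groupby method, combined in a loop. It is not the code from vb_suite/groupby.py. `BenchmarkSpec`, `SETUP_TEMPLATE`, `make_specs`, the method list, the ngroups values other than 10000, and the benchmark naming scheme are all assumptions made for illustration, and `BenchmarkSpec` is only a stand-in for a real benchmark object.

```python
from collections import namedtuple

# Hypothetical stand-in for a benchmark object (name, timed statement, setup code).
# It only stores strings, so this sketch runs without pandas or vbench installed.
BenchmarkSpec = namedtuple("BenchmarkSpec", ["name", "stmt", "setup"])

# Setup code kept as a template; {ngroups} is filled in per benchmark.
# Assumed data layout: two rows per group, one random float column.
SETUP_TEMPLATE = """
import numpy as np
from pandas import DataFrame

np.random.seed(1234)
ngroups = {ngroups}
size = ngroups * 2
rng = np.arange(ngroups)
df = DataFrame(dict(key=rng.repeat(2), value=np.random.randn(size)))
"""

# Assumed, illustrative subset of groupby methods to benchmark.
METHODS = ["count", "first", "last", "max", "mean", "min", "sum"]

# 10000 comes from the PR description; 100 is an assumed small baseline.
NGROUPS_LIST = [100, 10000]


def make_specs(methods=METHODS, ngroups_list=NGROUPS_LIST):
    """Build one benchmark spec per (ngroups, method) combination."""
    specs = []
    for ngroups in ngroups_list:
        setup = SETUP_TEMPLATE.format(ngroups=ngroups)
        for method in methods:
            specs.append(
                BenchmarkSpec(
                    name="groupby_ngroups_{0}_{1}".format(ngroups, method),
                    stmt="df.groupby('key')['value'].{0}()".format(method),
                    setup=setup,
                )
            )
    return specs


specs = make_specs()
```

Generating the specs in a loop keeps the method list and the group counts in one place, so covering another method or another ngroups value is a one-line change rather than a new hand-written benchmark.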