PERF: Discrepancy in groupby methods · Issue #19165 · pandas-dev/pandas
Description
issue filter for groupby & perf
- .describe()
- .mad()
- .pct_change() (Cythonized GroupBy pct_change #19919)
- .rank() (PERF: cythonize groupby-rank #15779) (PERF: Cythonize Groupby Rank #19481)
- .all() and .any() (Boolean operations in groupby objects are extremely slow compared to numpy counterpart. #15435)
- fillna() (PERF: groupby-fillna perf, implement in cython #11296) (Cythonized GroupBy Fill #19673)
- mode() (Groupby.mode() - feature request #19254)
Some groupby methods (notably describe, mad, and pct_change) are not as performant as others. Many of the less performant methods are pre-generated from the _common_apply_whitelist in pandas/core/groupby.py, so it may be worthwhile to revisit this implementation.
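For readers without an asv setup, the gap can be reproduced roughly with a plain timing loop. The following is a minimal sketch, not the asv benchmark itself: the helper names make_frame and time_method, the column names, and the data sizes are all assumptions chosen for illustration, and absolute timings will vary by machine and pandas version.

```python
import time

import numpy as np
import pandas as pd


def make_frame(n_rows=10_000, n_groups=100, dtype="int", seed=0):
    """Build a two-column frame: a grouping key and a value column.

    Sizes are illustrative assumptions, not those used by the asv suite.
    """
    rng = np.random.default_rng(seed)
    keys = rng.integers(0, n_groups, size=n_rows)
    values = rng.integers(0, 1000, size=n_rows).astype(dtype)
    return pd.DataFrame({"key": keys, "values": values})


def time_method(df, method, repeat=3):
    """Return the best wall-clock time (seconds) of one groupby method call."""
    grouped = df.groupby("key")["values"]
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        getattr(grouped, method)()  # look the method up by name, then call it
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    for dtype in ("int", "float"):
        frame = make_frame(dtype=dtype)
        for method in ("sum", "describe"):
            print(f"{dtype:6s} {method:10s} {time_method(frame, method):.6f}s")
```

Only sum and describe are timed here because both are available across pandas versions; other method names from the list above can be passed to time_method if the installed version provides them.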
asv dev -b ^groupby.GroupByMethods
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[ 0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[ 0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[100.00%] ··· Running groupby.GroupByMethods.time_method ok
[100.00%] ····
======= ============== ========
dtype method time
------- -------------- --------
int all 256ms
int any 255ms
int count 925μs
int cumcount 1.15ms
int cummax 1.13ms
int cummin 1.15ms
int cumprod 1.58ms
int cumsum 1.16ms
int describe 3.25s
int first 1.12ms
int head 1.37ms
int last 1.12ms
int mad 1.42s
int max 1.12ms
int min 1.16ms
int median 1.53ms
int mean 1.43ms
int nunique 1.40ms
int pct_change 1.56s
int prod 1.53ms
int rank 380ms
int sem 414ms
int shift 974μs
int size 858μs
int skew 414ms
int std 1.46ms
int sum 1.50ms
int tail 1.45ms
int unique 289ms
int value_counts 2.35ms
int var 1.34ms
float all 402ms
float any 406ms
float count 1.18ms
float cumcount 1.33ms
float cummax 1.40ms
float cummin 1.40ms
float cumprod 1.75ms
float cumsum 1.40ms
float describe 5.02s
float first 1.37ms
float head 1.58ms
float last 1.36ms
float mad 2.01s
float max 1.38ms
float min 1.37ms
float median 1.80ms
float mean 1.79ms
float nunique 1.60ms
float pct_change 2.17s
float prod 1.75ms
float rank 623ms
float sem 416ms
float shift 1.18ms
float size 1.09ms
float skew 646ms
float std 1.51ms
float sum 1.77ms
float tail 1.63ms
float unique 457ms
float value_counts 2.63ms
float var 1.43ms
======= ============== ========
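The table shows mean running in about a millisecond while mad takes over a second, which suggests that a slow method can sometimes be rebuilt from fast ones. The sketch below illustrates that idea for the group-wise mean absolute deviation; it is one possible workaround under stated assumptions, not the fix adopted in pandas. The function names grouped_mad and grouped_mad_apply are hypothetical, and the sketch relies only on transform("mean") and a second grouped mean.

```python
import numpy as np
import pandas as pd


def grouped_mad(df, key, col):
    """Mean absolute deviation per group, built from two grouped means.

    Hypothetical helper for illustration; not part of the pandas API.
    """
    # Broadcast each group's mean back onto its rows.
    group_means = df.groupby(key)[col].transform("mean")
    # Absolute deviation of every row from its own group mean.
    deviations = (df[col] - group_means).abs()
    # Average the deviations within each group.
    return deviations.groupby(df[key]).mean()


def grouped_mad_apply(df, key, col):
    """Reference version that calls a Python function once per group."""
    return df.groupby(key)[col].apply(lambda s: (s - s.mean()).abs().mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = pd.DataFrame(
        {
            "key": rng.integers(0, 50, size=5_000),
            "values": rng.normal(size=5_000),
        }
    )
    fast = grouped_mad(frame, "key", "values")
    slow = grouped_mad_apply(frame, "key", "values")
    # Both approaches should agree up to floating-point error.
    print(np.allclose(fast.to_numpy(), slow.to_numpy()))
```

The first version makes a fixed number of vectorized passes over the data regardless of the number of groups, whereas the apply-based version pays Python-level overhead for every group.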