Performance issues with groupby for large values of ngroups · Issue #8426 · pandas-dev/pandas

@dlovell

Per @jreback: #8410 (comment)

Groupby operations are very slow when ngroups is large (10000 in this case).

Invoked with :
--ncalls: 3
--repeats: 3


-----------------------------------------------------------
Test name                                    |     #0     |
-----------------------------------------------------------

...

groupby_large_ngroups_value_counts           | 23527.0356 |
groupby_large_ngroups_nunique                | 19496.5300 |
groupby_large_ngroups_describe               | 82329.8963 |
groupby_large_ngroups_mad                    | 21441.5947 |
groupby_large_ngroups_pct_change             | 18865.5750 |

See also: https://gist.github.com/dlovell/ea3400273314e7612f6e
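
For anyone who wants to approximate these timings outside the benchmark suite, here is a minimal sketch. It is not the code from the gist or from the benchmark suite: the data layout, the helper names (`make_frame`, `time_groupby`, `report`), and the sizes are all assumptions chosen for illustration. `mad` is left out because it may not be available in recent pandas releases.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(ngroups=10000, size=None, seed=0):
    """Build a hypothetical frame with roughly `ngroups` distinct keys."""
    rng = np.random.default_rng(seed)
    if size is None:
        size = ngroups * 10  # assumed ratio of rows to groups
    return pd.DataFrame({
        "key": rng.integers(0, ngroups, size=size),
        "value": rng.integers(0, size, size=size),
    })


def time_groupby(df, method, repeats=3):
    """Return the best wall-clock time in milliseconds for one groupby method."""
    grouped = df.groupby("key")["value"]
    func = getattr(grouped, method)
    # One call per repeat; keep the minimum to reduce noise.
    return min(timeit.repeat(func, number=1, repeat=repeats)) * 1000.0


def report(ngroups=10000,
           methods=("value_counts", "nunique", "describe", "pct_change")):
    """Print one timing line per groupby method."""
    df = make_frame(ngroups=ngroups)
    for method in methods:
        print(f"groupby_large_ngroups_{method:<15} | {time_groupby(df, method):12.4f} ms")
```

Calling `report()` prints one line per method. The absolute numbers will depend on the pandas version and the machine, so they are not directly comparable to the table above.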