BUG: transform and filter misbehave when grouping on categorical data by evanpw · Pull Request #9994 · pandas-dev/pandas

There is a slowdown of about 13% to 26% in the multi-key cases:

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
groupby_transform_series2                    | 124.4840 | 124.6049 |   0.9990 |
groupby_transform                            | 152.5413 | 151.6960 |   1.0056 |
groupby_transform_series                     |  20.0484 |  19.6927 |   1.0181 |
groupby_transform_ufunc                      | 112.1867 | 110.1057 |   1.0189 |
groupby_transform_multi_key4                 | 156.6887 | 138.9330 |   1.1278 |
groupby_transform_multi_key3                 | 918.0430 | 758.8653 |   1.2098 |
groupby_transform_multi_key2                 |  57.8043 |  46.5036 |   1.2430 |
groupby_transform_multi_key1                 |  87.1673 |  69.4257 |   1.2555 |

but that's because of the rest of the change, not this particular piece (anything with a MultiIndex already takes the slow path). The groupby_transform_ufunc test is the relevant one, and there's essentially no change there (ratio 1.0189).

This piece of the change also fixes GH #9700. I'll add a test for that.
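
For reference, the ratio column in the table above is the head time divided by the base time, so values above 1.0 mean head is slower. A minimal sketch that recomputes the ratios from the reported timings and flags the regressed benchmarks; the `slowdown_ratios` helper and the 1.10 cutoff are assumptions made for illustration, not part of the benchmark suite:

```python
# Timings (ms) copied from the benchmark table: name -> (head, base).
timings = {
    "groupby_transform_series2": (124.4840, 124.6049),
    "groupby_transform": (152.5413, 151.6960),
    "groupby_transform_series": (20.0484, 19.6927),
    "groupby_transform_ufunc": (112.1867, 110.1057),
    "groupby_transform_multi_key4": (156.6887, 138.9330),
    "groupby_transform_multi_key3": (918.0430, 758.8653),
    "groupby_transform_multi_key2": (57.8043, 46.5036),
    "groupby_transform_multi_key1": (87.1673, 69.4257),
}


def slowdown_ratios(timings):
    """Return {name: head / base}; values above 1.0 mean head is slower."""
    return {name: head / base for name, (head, base) in timings.items()}


# Hypothetical cutoff for calling a benchmark "regressed" (illustration only).
THRESHOLD = 1.10

ratios = slowdown_ratios(timings)
regressed = sorted(name for name, r in ratios.items() if r > THRESHOLD)

# Print benchmarks from least to most slowed down, marking the regressions.
for name, r in sorted(ratios.items(), key=lambda kv: kv[1]):
    flag = "  <-- slower" if r > THRESHOLD else ""
    print(f"{name:<30} {r:.4f}{flag}")
```

With this cutoff, only the four multi_key benchmarks are flagged, and groupby_transform_ufunc stays below it.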