PERF: Utilize mixed dtypes in df.count() with MultiIndexes by qwhelan · Pull Request #9163 · pandas-dev/pandas

@jreback Same underlying cause as #9136 but the solution is a bit more involved here.

Basically, we're transposing a potentially mixed-type frame before calling notnull(frame.values), which forces the transposed values to object dtype. The same result can be obtained by deferring the transpose until after notnull has given us a non-mixed (boolean) frame.
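To illustrate the equivalence, here is a minimal sketch. It is not the code from this PR; the frame `df` and its column names are made up for the example. It only checks that running the null check before the transpose yields the same mask as transposing first.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-dtype frame (float and string columns), for illustration only.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, "z"],
    "c": [1.0, 2.0, np.nan],
})

# Transpose first: the mixed frame collapses to a single object-dtype
# block before the null check runs.
mask_transpose_first = pd.notnull(df.T.values)

# Null check first: the resulting mask is homogeneous (bool), so the
# transpose is deferred to a plain boolean array.
mask_transpose_last = pd.notnull(df).values.T

# Both orderings produce the same boolean mask.
assert (mask_transpose_first == mask_transpose_last).all()
```

The second ordering never has to materialize the transposed object array, which is presumably where the time goes in the mixed-dtype benchmarks.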

Here are the vbench results:

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms]  |  ratio   |
-------------------------------------------------------------------------------
frame_count_level_axis1_mixed_dtypes_multi   |  82.2484 | 1489.9830 |   0.0552 |
frame_count_level_axis0_mixed_dtypes_multi   | 101.2537 | 1737.6347 |   0.0583 |
frame_count_level_axis0_multi                |  51.0643 |   51.3713 |   0.9940 |
frame_count_level_axis1_multi                |  98.5887 |   82.6767 |   1.1925 |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster than the baseline.
Seed used: 1234

Target [04ec4c6] : PERF: Utilize mixed dtypes in df.count() with MultiIndexes
Base   [def58c9] : Merge pull request #9128 from hsperr/expanduser
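
For reading the table, the ratio column is consistent with head divided by base. The short sketch below recomputes it from the timings reported above and prints the implied speedup; the variable names are only for illustration.

```python
# (test name, head[ms], base[ms], reported ratio) copied from the vbench table.
results = [
    ("frame_count_level_axis1_mixed_dtypes_multi", 82.2484, 1489.9830, 0.0552),
    ("frame_count_level_axis0_mixed_dtypes_multi", 101.2537, 1737.6347, 0.0583),
    ("frame_count_level_axis0_multi", 51.0643, 51.3713, 0.9940),
    ("frame_count_level_axis1_multi", 98.5887, 82.6767, 1.1925),
]

for name, head_ms, base_ms, reported in results:
    ratio = head_ms / base_ms   # < 1.0 means head (target) is faster
    speedup = base_ms / head_ms  # how many times faster head is than base
    # The reported ratio should match up to rounding of the printed timings.
    assert abs(ratio - reported) < 1e-4
    print(f"{name}: ratio={ratio:.4f}, speedup={speedup:.2f}x")
```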