PERF: Utilize mixed dtypes in df.count() with MultiIndexes by qwhelan · Pull Request #9163 · pandas-dev/pandas
@jreback Same underlying cause as #9136 but the solution is a bit more involved here.
Basically, we're transposing a potentially mixed-type frame before calling `notnull(frame.values)`. The same result can be obtained by calling `notnull` first, which gives us a non-mixed (all-boolean) frame, and deferring the transpose until after that.
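To illustrate the equivalence described above, here is a minimal sketch. It is not the code from this PR. The frame and its column names are made up for the example, and it uses the public `pd.notnull` rather than the internal call path in `count()`:

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-dtype frame (float, object, and datetime columns).
frame = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, "z"],
    "c": pd.to_datetime(["2015-01-01", None, "2015-01-03"]),
})

# Transpose first: the mixed frame has to be combined into a common
# dtype before the null check runs.
mask_transpose_first = pd.notnull(frame.T)

# Null check first: the mask is all-boolean, so the transpose operates
# on a single-dtype frame.
mask_notnull_first = pd.notnull(frame).T

# Both orderings should produce the same boolean mask.
same_mask = np.array_equal(mask_transpose_first.values,
                           mask_notnull_first.values)
print(same_mask)
```

Since the two masks agree, the order of the two operations only affects how much work the transpose has to do, not the resulting counts.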
Here are the vbench results:
-------------------------------------------------------------------------------
Test name | head[ms] | base[ms] | ratio |
-------------------------------------------------------------------------------
frame_count_level_axis1_mixed_dtypes_multi | 82.2484 | 1489.9830 | 0.0552 |
frame_count_level_axis0_mixed_dtypes_multi | 101.2537 | 1737.6347 | 0.0583 |
frame_count_level_axis0_multi | 51.0643 | 51.3713 | 0.9940 |
frame_count_level_axis1_multi | 98.5887 | 82.6767 | 1.1925 |
-------------------------------------------------------------------------------
Ratio < 1.0 means the target commit is faster than the baseline.
Seed used: 1234
Target [04ec4c6] : PERF: Utilize mixed dtypes in df.count() with MultiIndexes
Base [def58c9] : Merge pull request #9128 from hsperr/expanduser