BUG/PERF: merge_asof raising TypeError for various "by" column dtypes by lukemanley · Pull Request #55678 · pandas-dev/pandas
`merge_asof` currently raises a `TypeError` if the dtype of `by` is anything other than int64, uint64, or object. This PR removes that limitation.
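A minimal sketch of the kind of call affected, assuming a float64 `by` column as one example of a dtype outside int64/uint64/object. The frames and column names (`sensor`, `reading`, `calibration`) are made up for illustration and do not come from the PR or its tests. The `try`/`except` is only there so the snippet also runs on pandas versions without this change, where the call is expected to raise.

```python
import pandas as pd

# Hypothetical data: the "by" key ("sensor") is float64, i.e. not one of
# int64, uint64, or object.
left = pd.DataFrame(
    {
        "time": [1, 5, 10],
        "sensor": [1.5, 2.5, 1.5],
        "reading": [10, 20, 30],
    }
)
right = pd.DataFrame(
    {
        "time": [0, 4, 9],
        "sensor": [1.5, 2.5, 1.5],
        "calibration": [100, 200, 300],
    }
)

try:
    # Default direction is "backward": for each left row, take the latest
    # right row with the same "sensor" and a "time" <= the left "time".
    result = pd.merge_asof(left, right, on="time", by="sensor")
    print(result)
except TypeError as exc:
    # Expected on pandas versions that do not include this change.
    result = None
    print(f"merge_asof rejected the 'by' dtype: {exc}")
```

With the change in place, each row of `left` should pick up the `calibration` value from the most recent `right` row that has the same `sensor` value.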
| Change | Before [e48df1cf] <main> | After [a53a5533] <merge-asof-by-dtypes> | Ratio | Benchmark (Parameter) |
|----------|----------------------------|-------------------------------------------|---------|------------------------------------------------------|
| - | 299±30ms | 199±20ms | 0.67 | join_merge.MergeAsof.time_multiby('backward', 5) |
| - | 311±30ms | 202±20ms | 0.65 | join_merge.MergeAsof.time_multiby('backward', None) |
| - | 292±20ms | 158±20ms | 0.54 | join_merge.MergeAsof.time_by_object('forward', None) |
| - | 302±30ms | 157±10ms | 0.52 | join_merge.MergeAsof.time_by_object('forward', 5) |
| - | 411±10ms | 200±5ms | 0.49 | join_merge.MergeAsof.time_multiby('forward', 5) |
| - | 420±30ms | 202±20ms | 0.48 | join_merge.MergeAsof.time_by_object('nearest', None) |
| - | 457±20ms | 215±8ms | 0.47 | join_merge.MergeAsof.time_multiby('forward', None) |
| - | 515±10ms | 241±6ms | 0.47 | join_merge.MergeAsof.time_multiby('nearest', 5) |
| - | 519±10ms | 242±4ms | 0.47 | join_merge.MergeAsof.time_multiby('nearest', None) |
| - | 419±40ms | 185±20ms | 0.44 | join_merge.MergeAsof.time_by_object('nearest', 5) |
| - | 159±20ms | 64.5±8ms | 0.41 | join_merge.MergeAsof.time_by_int('backward', 5) |
| - | 154±10ms | 63.5±6ms | 0.41 | join_merge.MergeAsof.time_by_int('backward', None) |
| - | 437±20ms | 123±20ms | 0.28 | join_merge.MergeAsof.time_by_int('nearest', 5) |
| - | 328±30ms | 83.9±3ms | 0.26 | join_merge.MergeAsof.time_by_int('forward', None) |
| - | 470±30ms | 124±10ms | 0.26 | join_merge.MergeAsof.time_by_int('nearest', None) |
| - | 324±30ms | 81.7±3ms | 0.25 | join_merge.MergeAsof.time_by_int('forward', 5) |