PERF: implement scalar ops blockwise by jbrockmendel · Pull Request #29853 · pandas-dev/pandas
Resolved the issue with test_expressions behaving unexpectedly.
Added an asv benchmark that times binary operations between a homogeneous-dtype DataFrame (20k rows, 100 columns) and a scalar. Not sure how many variants of these to add; it would be easy to go overboard.
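For readers unfamiliar with asv, a benchmark of this shape could look roughly like the sketch below. It is reconstructed from the parameter combinations visible in the output (dtype, scalar, op) and is not necessarily the code in this PR: the class name `FrameWithScalarSketch` is hypothetical, the op list is a subset, and the random-data setup is an assumption.

```python
import operator

import numpy as np
import pandas as pd


class FrameWithScalarSketch:
    # Hypothetical asv-style benchmark; asv runs every combination of params.
    params = [
        [np.float64, np.int64],                   # frame dtype
        [2, 3.0, 4, 5.0],                         # scalar operand (assumed values)
        [operator.add, operator.sub, operator.mul,
         operator.truediv, operator.floordiv,
         operator.eq, operator.lt],               # subset of the ops in the output
    ]
    param_names = ["dtype", "scalar", "op"]

    def setup(self, dtype, scalar, op):
        # Single-dtype frame, so all columns can live in one 2D block.
        arr = np.random.randn(20000, 100)
        self.df = pd.DataFrame(arr.astype(dtype))

    def time_frame_op_with_scalar(self, dtype, scalar, op):
        # asv times the body of every method whose name starts with "time_".
        op(self.df, scalar)
```

Outside of asv, one combination can be run by hand: instantiate the class, call `setup(np.int64, 2, operator.add)`, then call `time_frame_op_with_scalar` with the same arguments.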
before after ratio
[0cd388fd] [1fc1e3ec]
<cy30> <back-to-arith>
- 110±4ms 59.6±1ms 0.54 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function floordiv>)
- 115±3ms 59.5±1ms 0.52 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function floordiv>)
- 88.2±1ms 44.1±0.7ms 0.50 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function floordiv>)
- 91.0±3ms 44.7±0.9ms 0.49 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function floordiv>)
- 94.8±3ms 46.5±1ms 0.49 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function floordiv>)
- 92.1±2ms 44.9±0.4ms 0.49 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function floordiv>)
- 93.9±4ms 42.2±0.2ms 0.45 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function floordiv>)
- 94.6±2ms 41.7±1ms 0.44 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function floordiv>)
- 78.0±4ms 28.6±0.4ms 0.37 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function pow>)
- 66.9±1ms 22.5±0.6ms 0.34 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function mod>)
- 67.8±2ms 22.1±1ms 0.33 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function mod>)
- 70.0±3ms 22.5±0.5ms 0.32 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function pow>)
- 67.3±2ms 20.8±0.6ms 0.31 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function mod>)
- 66.7±1ms 20.3±1ms 0.30 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function mod>)
- 64.9±1ms 19.3±0.5ms 0.30 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function mod>)
- 65.5±0.7ms 19.1±0.8ms 0.29 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function mod>)
- 73.2±1ms 18.2±1ms 0.25 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function mod>)
- 74.7±3ms 18.3±1ms 0.25 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function mod>)
- 60.3±1ms 9.87±0.1ms 0.16 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function pow>)
- 60.0±2ms 9.80±0.3ms 0.16 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function pow>)
- 59.4±3ms 9.65±0.4ms 0.16 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function pow>)
- 58.5±2ms 9.09±0.3ms 0.16 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function pow>)
- 42.6±0.9ms 3.96±0.09ms 0.09 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function xor>)
- 51.3±0.4ms 4.70±0.1ms 0.09 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function pow>)
- 52.7±2ms 4.68±0.1ms 0.09 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function pow>)
- 42.3±0.8ms 3.54±0.1ms 0.08 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function or_>)
- 43.4±2ms 3.36±0.2ms 0.08 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function and_>)
- 43.0±1ms 3.23±0.2ms 0.08 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function xor>)
- 44.9±1ms 3.36±0.2ms 0.07 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function or_>)
- 45.6±0.4ms 3.41±0.1ms 0.07 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function and_>)
- 49.8±1ms 3.24±0.1ms 0.07 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function mul>)
- 28.0±0.5ms 1.81±0.2ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function eq>)
- 48.8±0.7ms 3.15±0.07ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function truediv>)
- 50.1±0.5ms 3.15±0.09ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function mul>)
- 50.1±0.8ms 3.14±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function truediv>)
- 51.4±1ms 3.21±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function truediv>)
- 49.6±0.3ms 3.10±0.05ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function sub>)
- 49.5±0.9ms 3.08±0.06ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function mul>)
- 51.3±1ms 3.17±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function add>)
- 50.0±1ms 3.07±0.09ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function add>)
- 28.7±0.8ms 1.77±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function gt>)
- 50.4±0.8ms 3.06±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function add>)
- 29.1±1ms 1.76±0.03ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function ne>)
- 50.1±1ms 3.02±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function truediv>)
- 49.8±0.7ms 2.99±0.07ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function truediv>)
- 50.4±1ms 3.02±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function truediv>)
- 50.9±1ms 3.04±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function mul>)
- 50.8±1ms 3.03±0.09ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function truediv>)
- 50.8±1ms 3.03±0.3ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function sub>)
- 52.2±0.9ms 3.08±0.09ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function mul>)
- 54.0±1ms 3.17±0.09ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function add>)
- 51.1±0.7ms 2.98±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function add>)
- 50.8±1ms 2.96±0.06ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function sub>)
- 53.7±1ms 3.12±0.04ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function sub>)
- 52.7±2ms 3.05±0.08ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function truediv>)
- 51.2±1ms 2.96±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function sub>)
- 28.6±1ms 1.65±0.07ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function le>)
- 29.9±0.6ms 1.72±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function ge>)
- 29.7±2ms 1.69±0.04ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function lt>)
- 54.0±0.7ms 3.06±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function sub>)
- 55.1±0.9ms 3.12±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function add>)
- 50.6±0.8ms 2.86±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function add>)
- 59.1±2ms 3.31±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function sub>)
- 29.4±1ms 1.65±0.08ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function ge>)
- 53.5±1ms 2.97±0.1ms 0.06 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function mul>)
- 30.9±2ms 1.69±0.1ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function lt>)
- 29.2±2ms 1.59±0.09ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function eq>)
- 54.9±0.9ms 2.98±0.2ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function mul>)
- 54.2±1ms 2.93±0.1ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function add>)
- 54.5±1ms 2.92±0.08ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function sub>)
- 29.8±2ms 1.58±0.1ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function gt>)
- 58.2±3ms 3.08±0.2ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function mul>)
- 31.2±1ms 1.61±0.1ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function le>)
- 31.0±2ms 1.59±0.1ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function ne>)
- 27.8±1ms 1.29±0.06ms 0.05 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function ge>)
- 27.4±0.8ms 1.22±0.1ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function le>)
- 28.0±1ms 1.17±0.04ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function le>)
- 27.8±0.6ms 1.16±0.1ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function lt>)
- 27.6±0.9ms 1.15±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function lt>)
- 27.8±0.4ms 1.16±0.07ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function le>)
- 29.8±2ms 1.24±0.06ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function ne>)
- 27.7±0.8ms 1.15±0.07ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function lt>)
- 27.4±0.6ms 1.14±0.08ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function le>)
- 28.5±0.8ms 1.18±0.05ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function ne>)
- 27.8±0.7ms 1.15±0.01ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function le>)
- 28.5±0.6ms 1.16±0.06ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function gt>)
- 27.1±0.8ms 1.11±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function lt>)
- 27.5±0.6ms 1.12±0.05ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function gt>)
- 27.4±0.4ms 1.11±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function ne>)
- 27.1±0.9ms 1.10±0.05ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function ge>)
- 28.6±0.4ms 1.14±0.06ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function eq>)
- 27.4±0.6ms 1.09±0.02ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function eq>)
- 28.2±0.7ms 1.12±0.05ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function ne>)
- 27.3±0.6ms 1.08±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function ge>)
- 28.1±1ms 1.12±0.02ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function gt>)
- 28.3±1ms 1.12±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function gt>)
- 28.1±0.8ms 1.11±0.05ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function lt>)
- 27.1±0.8ms 1.07±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function le>)
- 27.9±1ms 1.10±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function ge>)
- 28.2±0.5ms 1.09±0.06ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function lt>)
- 29.0±0.5ms 1.11±0.04ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function ne>)
- 29.0±2ms 1.11±0.04ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function gt>)
- 29.2±0.2ms 1.11±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function eq>)
- 28.2±2ms 1.07±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function ge>)
- 29.1±1ms 1.10±0.04ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function gt>)
- 29.5±0.2ms 1.11±0.05ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function eq>)
- 30.2±2ms 1.13±0.04ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function ge>)
- 28.4±0.2ms 1.05±0.03ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function eq>)
- 29.9±1ms 1.09±0.05ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function ne>)
- 29.1±0.6ms 1.03±0.02ms 0.04 binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function eq>)
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
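For anyone reading the table: the ratio column is the after time divided by the before time, so smaller is better, and a leading `-` marks an improvement. A minimal sketch that recomputes the ratio and the equivalent speedup for the first and last rows (the helper name `asv_ratio` is just for illustration):

```python
def asv_ratio(before_ms: float, after_ms: float) -> float:
    """Ratio as reported by asv compare: after / before (below 1 means faster)."""
    return after_ms / before_ms


# (before, after) mean timings in ms, copied from the first and last rows above.
rows = {
    "float64 frame, scalar 5.0, eq": (29.1, 1.03),
    "int64 frame, scalar 2, floordiv": (110.0, 59.6),
}

for name, (before, after) in rows.items():
    ratio = asv_ratio(before, after)
    print(f"{name}: ratio={ratio:.2f}, speedup={before / after:.1f}x")
```

So the comparison ops end up roughly 25 to 28 times faster, while floordiv improves by a bit under 2x.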