PERF: ~40x speedup in sparse init and ops by using numpy in check_integrity by qwhelan · Pull Request #24985 · pandas-dev/pandas
A pretty significant regression was introduced into SparseArray operations around the release of v0.20.0.
A run of asv find identified #15863 as the source; simply using NumPy operations rather than Python-in-Cython yields a ~40x speedup:
$ asv compare master HEAD -s --sort ratio
Benchmarks that have improved:
before after ratio
[2b16e2e6] [0e35de9b]
<master> <sparse_check_integrity>
- 46.4±0.4ms 42.0±0.1ms 0.91 sparse.SparseArrayConstructor.time_sparse_array(0.01, nan, <class 'object'>)
- 216±0.4μs 193±0.4μs 0.89 sparse.Arithmetic.time_intersect(0.01, 0)
- 60.1±0.2ms 48.1±0.1ms 0.80 sparse.SparseArrayConstructor.time_sparse_array(0.01, 0, <class 'object'>)
- 1.02±0.01s 594±6ms 0.58 reshape.GetDummies.time_get_dummies_1d_sparse
- 96.3±10ms 54.1±0.2ms 0.56 sparse.SparseArrayConstructor.time_sparse_array(0.1, nan, <class 'object'>)
- 105±0.3ms 56.1±0.3ms 0.54 sparse.SparseArrayConstructor.time_sparse_array(0.1, 0, <class 'object'>)
- 7.43±0.2ms 3.17±0.08ms 0.43 sparse.SparseArrayConstructor.time_sparse_array(0.01, nan, <class 'numpy.float64'>)
- 6.67±0.02ms 2.50±0.03ms 0.37 sparse.SparseArrayConstructor.time_sparse_array(0.01, 0, <class 'numpy.int64'>)
- 660±9ms 237±0.2ms 0.36 sparse.Arithmetic.time_make_union(0.1, nan)
- 674±10ms 237±0.2ms 0.35 sparse.Arithmetic.time_make_union(0.01, nan)
- 6.27±0.04ms 2.03±0.02ms 0.32 sparse.SparseArrayConstructor.time_sparse_array(0.01, 0, <class 'numpy.float64'>)
- 5.71±0.08ms 1.58±0.01ms 0.28 sparse.Arithmetic.time_intersect(0.1, 0)
- 50.0±0.6ms 7.94±0.04ms 0.16 sparse.SparseArrayConstructor.time_sparse_array(0.1, nan, <class 'numpy.float64'>)
- 93.4±0.6ms 14.7±0.04ms 0.16 sparse.Arithmetic.time_divide(0.1, 0)
- 93.4±0.6ms 14.5±0.05ms 0.16 sparse.Arithmetic.time_add(0.1, 0)
- 9.69±0.03ms 1.46±0ms 0.15 sparse.Arithmetic.time_divide(0.01, 0)
- 48.7±0.4ms 7.29±0.01ms 0.15 sparse.SparseArrayConstructor.time_sparse_array(0.1, 0, <class 'numpy.int64'>)
- 9.71±0.04ms 1.45±0ms 0.15 sparse.Arithmetic.time_add(0.01, 0)
- 91.6±0.5ms 12.9±0.04ms 0.14 sparse.Arithmetic.time_make_union(0.1, 0)
- 49.8±0.8ms 6.84±0.01ms 0.14 sparse.SparseArrayConstructor.time_sparse_array(0.1, 0, <class 'numpy.float64'>)
- 9.37±0.01ms 1.18±0ms 0.13 sparse.Arithmetic.time_make_union(0.01, 0)
- 9.60±0.1ms 1.05±0ms 0.11 sparse.ArithmeticBlock.time_division(nan)
- 9.50±0.2ms 1.03±0ms 0.11 sparse.ArithmeticBlock.time_addition(nan)
- 9.49±0.1ms 991±2μs 0.10 sparse.ArithmeticBlock.time_division(0)
- 9.52±0.1ms 981±2μs 0.10 sparse.ArithmeticBlock.time_addition(0)
- 9.19±0.1ms 827±2μs 0.09 sparse.ArithmeticBlock.time_make_union(nan)
- 9.01±0.1ms 796±2μs 0.09 sparse.ArithmeticBlock.time_make_union(0)
- 4.31±0.04ms 156±0.3μs 0.04 sparse.ArithmeticBlock.time_intersect(nan)
- 4.33±0.09ms 156±0.5μs 0.04 sparse.ArithmeticBlock.time_intersect(0)
- 464±20ms 14.6±1ms 0.03 sparse.SparseArrayConstructor.time_sparse_array(0.01, nan, <class 'numpy.int64'>)
- 439±4ms 12.1±1ms 0.03 sparse.SparseArrayConstructor.time_sparse_array(0.1, nan, <class 'numpy.int64'>)
- 440±2ms 10.7±0.01ms 0.02 sparse.Arithmetic.time_intersect(0.01, nan)
- 444±10ms 10.7±0.01ms 0.02 sparse.Arithmetic.time_intersect(0.1, nan)
Benchmarks that have stayed the same:
before after ratio
[2b16e2e6] [0e35de9b]
<master> <sparse_check_integrity>
246±10ms 251±7ms 1.02 sparse.SparseSeriesToFrame.time_series_to_frame
597±2ms 605±10ms 1.01 sparse.SparseDataFrameConstructor.time_from_scipy
6.21±0s 6.26±0.02s 1.01 sparse.SparseDataFrameConstructor.time_constructor
2.87±0.01ms 2.88±0.04ms 1.01 sparse.FromCoo.time_sparse_series_from_coo
246±1ms 247±2ms 1.00 sparse.SparseDataFrameConstructor.time_from_dict
2.29±0.05ms 2.29±0.01ms 1.00 sparse.Arithmetic.time_add(0.1, nan)
2.06±0.01ms 2.06±0.01ms 1.00 reshape.SparseIndex.time_unstack
4.61±0.01ms 4.62±0.01ms 1.00 sparse.Arithmetic.time_divide(0.1, nan)
4.05±0.01ms 4.04±0.01ms 1.00 sparse.Arithmetic.time_divide(0.01, nan)
46.2±0.3ms 46.1±0.1ms 1.00 sparse.ToCoo.time_sparse_series_to_coo
2.31±0.03ms 2.29±0.02ms 0.99 sparse.Arithmetic.time_add(0.01, nan)
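
For readers who want a feel for the kind of change involved, here is a minimal sketch of an integrity check written two ways: once as an element-by-element loop, and once with whole-array NumPy operations. This is not the code from this PR; the function names, signatures, and error messages are hypothetical, and the sketch assumes the check only has to confirm that the sparse indices are non-negative, below the array length, and strictly increasing.

```python
import numpy as np


def check_integrity_loop(indices, length):
    """Hypothetical per-element check: one Python-level comparison per index."""
    if len(indices) > length:
        raise ValueError("Too many indices")
    prev = -1
    for idx in indices:
        if idx < 0:
            raise ValueError("No index can be less than zero")
        if idx >= length:
            raise ValueError("All indices must be less than the length")
        if idx <= prev:
            raise ValueError("Indices must be strictly increasing")
        prev = idx


def check_integrity_numpy(indices, length):
    """Hypothetical vectorized check: the same rules, a few whole-array operations."""
    indices = np.asarray(indices, dtype=np.int32)
    if indices.size > length:
        raise ValueError("Too many indices")
    if indices.size == 0:
        # Nothing else to validate; min() and max() would fail on an empty array.
        return
    if indices.min() < 0:
        raise ValueError("No index can be less than zero")
    if indices.max() >= length:
        raise ValueError("All indices must be less than the length")
    # Compare each element with its successor in a single array operation.
    if not np.all(indices[:-1] < indices[1:]):
        raise ValueError("Indices must be strictly increasing")


if __name__ == "__main__":
    idx = np.arange(0, 1_000_000, 10, dtype=np.int32)
    check_integrity_loop(idx, 1_000_000)
    check_integrity_numpy(idx, 1_000_000)
```

Both functions accept and reject the same inputs; the second simply moves the per-element work into NumPy. Timing the two on your own data (for example with timeit) is the easiest way to see how large the gap is in a given setting.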