PERF: ~40x speedup in sparse init and ops by using numpy in check_integrity by qwhelan · Pull Request #24985 · pandas-dev/pandas

A pretty significant performance regression was introduced into SparseArray operations around the release of v0.20.0.

A run of asv find identified #15863 as the source; simply using NumPy operations rather than Python-in-Cython code in check_integrity yields up to a ~40x speedup:

$ asv compare master HEAD -s --sort ratio

Benchmarks that have improved:

       before           after         ratio
     [2b16e2e6]       [0e35de9b]
     <master>         <sparse_check_integrity>
-      46.4±0.4ms       42.0±0.1ms     0.91  sparse.SparseArrayConstructor.time_sparse_array(0.01, nan, <class 'object'>)
-       216±0.4μs        193±0.4μs     0.89  sparse.Arithmetic.time_intersect(0.01, 0)
-      60.1±0.2ms       48.1±0.1ms     0.80  sparse.SparseArrayConstructor.time_sparse_array(0.01, 0, <class 'object'>)
-      1.02±0.01s          594±6ms     0.58  reshape.GetDummies.time_get_dummies_1d_sparse
-       96.3±10ms       54.1±0.2ms     0.56  sparse.SparseArrayConstructor.time_sparse_array(0.1, nan, <class 'object'>)
-       105±0.3ms       56.1±0.3ms     0.54  sparse.SparseArrayConstructor.time_sparse_array(0.1, 0, <class 'object'>)
-      7.43±0.2ms      3.17±0.08ms     0.43  sparse.SparseArrayConstructor.time_sparse_array(0.01, nan, <class 'numpy.float64'>)
-     6.67±0.02ms      2.50±0.03ms     0.37  sparse.SparseArrayConstructor.time_sparse_array(0.01, 0, <class 'numpy.int64'>)
-         660±9ms        237±0.2ms     0.36  sparse.Arithmetic.time_make_union(0.1, nan)
-        674±10ms        237±0.2ms     0.35  sparse.Arithmetic.time_make_union(0.01, nan)
-     6.27±0.04ms      2.03±0.02ms     0.32  sparse.SparseArrayConstructor.time_sparse_array(0.01, 0, <class 'numpy.float64'>)
-     5.71±0.08ms      1.58±0.01ms     0.28  sparse.Arithmetic.time_intersect(0.1, 0)
-      50.0±0.6ms      7.94±0.04ms     0.16  sparse.SparseArrayConstructor.time_sparse_array(0.1, nan, <class 'numpy.float64'>)
-      93.4±0.6ms      14.7±0.04ms     0.16  sparse.Arithmetic.time_divide(0.1, 0)
-      93.4±0.6ms      14.5±0.05ms     0.16  sparse.Arithmetic.time_add(0.1, 0)
-     9.69±0.03ms         1.46±0ms     0.15  sparse.Arithmetic.time_divide(0.01, 0)
-      48.7±0.4ms      7.29±0.01ms     0.15  sparse.SparseArrayConstructor.time_sparse_array(0.1, 0, <class 'numpy.int64'>)
-     9.71±0.04ms         1.45±0ms     0.15  sparse.Arithmetic.time_add(0.01, 0)
-      91.6±0.5ms      12.9±0.04ms     0.14  sparse.Arithmetic.time_make_union(0.1, 0)
-      49.8±0.8ms      6.84±0.01ms     0.14  sparse.SparseArrayConstructor.time_sparse_array(0.1, 0, <class 'numpy.float64'>)
-     9.37±0.01ms         1.18±0ms     0.13  sparse.Arithmetic.time_make_union(0.01, 0)
-      9.60±0.1ms         1.05±0ms     0.11  sparse.ArithmeticBlock.time_division(nan)
-      9.50±0.2ms         1.03±0ms     0.11  sparse.ArithmeticBlock.time_addition(nan)
-      9.49±0.1ms          991±2μs     0.10  sparse.ArithmeticBlock.time_division(0)
-      9.52±0.1ms          981±2μs     0.10  sparse.ArithmeticBlock.time_addition(0)
-      9.19±0.1ms          827±2μs     0.09  sparse.ArithmeticBlock.time_make_union(nan)
-      9.01±0.1ms          796±2μs     0.09  sparse.ArithmeticBlock.time_make_union(0)
-     4.31±0.04ms        156±0.3μs     0.04  sparse.ArithmeticBlock.time_intersect(nan)
-     4.33±0.09ms        156±0.5μs     0.04  sparse.ArithmeticBlock.time_intersect(0)
-        464±20ms         14.6±1ms     0.03  sparse.SparseArrayConstructor.time_sparse_array(0.01, nan, <class 'numpy.int64'>)
-         439±4ms         12.1±1ms     0.03  sparse.SparseArrayConstructor.time_sparse_array(0.1, nan, <class 'numpy.int64'>)
-         440±2ms      10.7±0.01ms     0.02  sparse.Arithmetic.time_intersect(0.01, nan)
-        444±10ms      10.7±0.01ms     0.02  sparse.Arithmetic.time_intersect(0.1, nan)
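
For reading the table above: the ratio column is the after time divided by the before time, so the speedup factor is its reciprocal. A small sketch of that arithmetic, using three rows copied from the table (the `speedup` helper is a hypothetical name, not part of asv):

```python
# (before, after) timings in milliseconds, copied from the table above.
timings_ms = {
    "sparse.Arithmetic.time_intersect(0.01, nan)": (440.0, 10.7),
    "sparse.ArithmeticBlock.time_intersect(nan)": (4.31, 0.156),
    "sparse.Arithmetic.time_add(0.1, 0)": (93.4, 14.5),
}


def speedup(before, after):
    """Speedup factor: how many times faster the 'after' run is."""
    return before / after


for name, (before, after) in timings_ms.items():
    ratio = after / before  # the value asv prints in the ratio column
    print(f"{name}: ratio={ratio:.2f}, speedup={speedup(before, after):.1f}x")
```

The rows with a ratio of 0.02 are the ones behind the ~40x figure; most of the other improved benchmarks gain less.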

Benchmarks that have stayed the same:

       before           after         ratio
     [2b16e2e6]       [0e35de9b]
     <master>         <sparse_check_integrity>
         246±10ms          251±7ms     1.02  sparse.SparseSeriesToFrame.time_series_to_frame
          597±2ms         605±10ms     1.01  sparse.SparseDataFrameConstructor.time_from_scipy
          6.21±0s       6.26±0.02s     1.01  sparse.SparseDataFrameConstructor.time_constructor
      2.87±0.01ms      2.88±0.04ms     1.01  sparse.FromCoo.time_sparse_series_from_coo
          246±1ms          247±2ms     1.00  sparse.SparseDataFrameConstructor.time_from_dict
      2.29±0.05ms      2.29±0.01ms     1.00  sparse.Arithmetic.time_add(0.1, nan)
      2.06±0.01ms      2.06±0.01ms     1.00  reshape.SparseIndex.time_unstack
      4.61±0.01ms      4.62±0.01ms     1.00  sparse.Arithmetic.time_divide(0.1, nan)
      4.05±0.01ms      4.04±0.01ms     1.00  sparse.Arithmetic.time_divide(0.01, nan)
       46.2±0.3ms       46.1±0.1ms     1.00  sparse.ToCoo.time_sparse_series_to_coo
      2.31±0.03ms      2.29±0.02ms     0.99  sparse.Arithmetic.time_add(0.01, nan)
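
The kind of change described at the top of this PR can be sketched roughly as follows. This is not the actual diff: the function names are hypothetical, the pure-Python loop only stands in for the Python-in-Cython code, and the specific checks (non-negative, within bounds, strictly increasing) are assumptions about what an integrity check on a sparse integer index verifies.

```python
import numpy as np


def check_integrity_loop(indices, length):
    """Element-by-element check (hypothetical stand-in for the slow path)."""
    if len(indices) > length:
        raise ValueError("Too many indices")
    prev = -1
    for idx in indices:
        if idx < 0:
            raise ValueError("No index can be less than zero")
        if idx >= length:
            raise ValueError("All indices must be less than the length")
        if idx <= prev:
            raise ValueError("Indices must be strictly increasing")
        prev = idx


def check_integrity_numpy(indices, length):
    """Same checks expressed as whole-array NumPy operations."""
    indices = np.asarray(indices, dtype=np.int32)
    if indices.size > length:
        raise ValueError("Too many indices")
    if indices.size == 0:
        return  # min()/max() are undefined on an empty array
    if indices.min() < 0:
        raise ValueError("No index can be less than zero")
    if indices.max() >= length:
        raise ValueError("All indices must be less than the length")
    # Compare each element with its successor in one vectorized pass.
    if not np.all(indices[:-1] < indices[1:]):
        raise ValueError("Indices must be strictly increasing")


# Both versions accept a valid index and reject a non-increasing one.
valid = np.arange(0, 1000, 3, dtype=np.int32)
check_integrity_loop(valid, 1000)
check_integrity_numpy(valid, 1000)
```

The vectorized version does a fixed number of array-level calls regardless of the number of indices, which is why the benefit grows with the size of the index.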