PERF: json_normalize, for basic use case by smpurkis · Pull Request #40035 · pandas-dev/pandas

Found the issue: my check of the parameters was incorrect. I reran the benchmark:

       before           after         ratio
     [a241cfc6]       [5959eaab]
     <issue-15621-improve-json-normalize-perf>       <master>  
-         317±1ms         65.5±2ms     0.21  io.json.NormalizeJSON.time_normalize_json('values', 'df_date_idx')
-       317±0.5ms       65.4±0.3ms     0.21  io.json.NormalizeJSON.time_normalize_json('split', 'df_date_idx')
-         317±2ms       65.3±0.9ms     0.21  io.json.NormalizeJSON.time_normalize_json('values', 'df_td_int_ts')
-         316±2ms       65.2±0.5ms     0.21  io.json.NormalizeJSON.time_normalize_json('index', 'df_date_idx')
-         317±1ms       65.4±0.4ms     0.21  io.json.NormalizeJSON.time_normalize_json('index', 'df_int_floats')
-       316±0.9ms       65.1±0.3ms     0.21  io.json.NormalizeJSON.time_normalize_json('values', 'df')
-       315±0.8ms       64.9±0.1ms     0.21  io.json.NormalizeJSON.time_normalize_json('columns', 'df_td_int_ts')
-       316±0.4ms       65.0±0.2ms     0.21  io.json.NormalizeJSON.time_normalize_json('index', 'df')
-       316±0.6ms       64.9±0.4ms     0.21  io.json.NormalizeJSON.time_normalize_json('split', 'df_td_int_ts')
-         316±2ms       64.9±0.3ms     0.21  io.json.NormalizeJSON.time_normalize_json('split', 'df_int_floats')
-       316±0.6ms       64.8±0.2ms     0.21  io.json.NormalizeJSON.time_normalize_json('records', 'df')
-         317±1ms       65.0±0.2ms     0.21  io.json.NormalizeJSON.time_normalize_json('records', 'df_date_idx')
-         317±1ms       65.0±0.5ms     0.21  io.json.NormalizeJSON.time_normalize_json('index', 'df_int_float_str')
-       317±0.4ms       65.1±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('records', 'df_int_floats')
-       316±0.7ms       64.8±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('values', 'df_int_floats')
-       316±0.2ms       64.8±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('columns', 'df_int_floats')
-         315±1ms       64.5±0.3ms     0.20  io.json.NormalizeJSON.time_normalize_json('split', 'df_int_float_str')
-         318±1ms       64.9±0.1ms     0.20  io.json.NormalizeJSON.time_normalize_json('split', 'df')
-       317±0.9ms       64.8±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('columns', 'df_date_idx')
-         317±1ms       64.8±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('values', 'df_int_float_str')
-         318±1ms       64.8±0.3ms     0.20  io.json.NormalizeJSON.time_normalize_json('records', 'df_td_int_ts')
-       318±0.8ms       64.8±0.4ms     0.20  io.json.NormalizeJSON.time_normalize_json('records', 'df_int_float_str')
-         319±5ms       65.1±0.5ms     0.20  io.json.NormalizeJSON.time_normalize_json('columns', 'df_int_float_str')
-         317±2ms       64.6±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('columns', 'df')
-         319±2ms       64.9±0.4ms     0.20  io.json.NormalizeJSON.time_normalize_json('index', 'df_td_int_ts')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
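
For anyone reading the table: the `ratio` column is the after time divided by the before time, so 0.21 means roughly a 4.8x speedup. Here is a minimal sketch that recomputes it from two of the rows above. The `parse_row` helper and the `TIMING` pattern are illustrative names of my own, not part of asv, and the recomputed values can differ slightly from the reported ones because the displayed timings are already rounded.

```python
import re

# Two rows copied verbatim from the asv output above.
ROWS = """\
-         317±1ms         65.5±2ms     0.21  io.json.NormalizeJSON.time_normalize_json('values', 'df_date_idx')
-       317±0.4ms       65.1±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('records', 'df_int_floats')
"""

# Illustrative pattern (an assumption, not an asv API): value, ± spread, "ms" unit.
TIMING = re.compile(r"([\d.]+)±([\d.]+)ms")


def parse_row(line):
    """Return (before_ms, after_ms, reported_ratio, benchmark_name) for one row."""
    # The first two timing matches are the before and after columns.
    (before, _), (after, _) = TIMING.findall(line)[:2]
    fields = line.split()
    reported = float(fields[3])    # the ratio column as printed
    name = " ".join(fields[4:])    # benchmark name, including its parameters
    return float(before), float(after), reported, name


results = [parse_row(line) for line in ROWS.splitlines()]

for before, after, reported, name in results:
    ratio = after / before         # ratio below 1 means the "after" run is faster
    speedup = before / after
    print(f"{name}: ratio {ratio:.3f} (reported {reported}), speedup {speedup:.1f}x")
```

The spread after `±` is ignored here. It is small relative to the gap between the two columns, so it does not change the conclusion.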