PERF: json_normalize, for basic use case by smpurkis · Pull Request #40035 · pandas-dev/pandas
Found the issue: my check of the parameters was incorrect. I reran the benchmark:
       before           after         ratio
     [a241cfc6]       [5959eaab]
     <issue-15621-improve-json-normalize-perf>       <master>
-         317±1ms       65.5±2ms     0.21  io.json.NormalizeJSON.time_normalize_json('values', 'df_date_idx')
-       317±0.5ms     65.4±0.3ms     0.21  io.json.NormalizeJSON.time_normalize_json('split', 'df_date_idx')
-         317±2ms     65.3±0.9ms     0.21  io.json.NormalizeJSON.time_normalize_json('values', 'df_td_int_ts')
-         316±2ms     65.2±0.5ms     0.21  io.json.NormalizeJSON.time_normalize_json('index', 'df_date_idx')
-         317±1ms     65.4±0.4ms     0.21  io.json.NormalizeJSON.time_normalize_json('index', 'df_int_floats')
-       316±0.9ms     65.1±0.3ms     0.21  io.json.NormalizeJSON.time_normalize_json('values', 'df')
-       315±0.8ms     64.9±0.1ms     0.21  io.json.NormalizeJSON.time_normalize_json('columns', 'df_td_int_ts')
-       316±0.4ms     65.0±0.2ms     0.21  io.json.NormalizeJSON.time_normalize_json('index', 'df')
-       316±0.6ms     64.9±0.4ms     0.21  io.json.NormalizeJSON.time_normalize_json('split', 'df_td_int_ts')
-         316±2ms     64.9±0.3ms     0.21  io.json.NormalizeJSON.time_normalize_json('split', 'df_int_floats')
-       316±0.6ms     64.8±0.2ms     0.21  io.json.NormalizeJSON.time_normalize_json('records', 'df')
-         317±1ms     65.0±0.2ms     0.21  io.json.NormalizeJSON.time_normalize_json('records', 'df_date_idx')
-         317±1ms     65.0±0.5ms     0.21  io.json.NormalizeJSON.time_normalize_json('index', 'df_int_float_str')
-       317±0.4ms     65.1±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('records', 'df_int_floats')
-       316±0.7ms     64.8±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('values', 'df_int_floats')
-       316±0.2ms     64.8±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('columns', 'df_int_floats')
-         315±1ms     64.5±0.3ms     0.20  io.json.NormalizeJSON.time_normalize_json('split', 'df_int_float_str')
-         318±1ms     64.9±0.1ms     0.20  io.json.NormalizeJSON.time_normalize_json('split', 'df')
-       317±0.9ms     64.8±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('columns', 'df_date_idx')
-         317±1ms     64.8±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('values', 'df_int_float_str')
-         318±1ms     64.8±0.3ms     0.20  io.json.NormalizeJSON.time_normalize_json('records', 'df_td_int_ts')
-       318±0.8ms     64.8±0.4ms     0.20  io.json.NormalizeJSON.time_normalize_json('records', 'df_int_float_str')
-         319±5ms     65.1±0.5ms     0.20  io.json.NormalizeJSON.time_normalize_json('columns', 'df_int_float_str')
-         317±2ms     64.6±0.2ms     0.20  io.json.NormalizeJSON.time_normalize_json('columns', 'df')
-         319±2ms     64.9±0.4ms     0.20  io.json.NormalizeJSON.time_normalize_json('index', 'df_td_int_ts')
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
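
For anyone reading the table: the ratio column is the "after" time divided by the "before" time, so values below 1 mean the "after" commit is faster. Below is a minimal sketch that recomputes it for the first row. The `parse_row` helper and its regex are hypothetical, written only for this illustration, and are not part of asv or pandas; the regex also assumes every timing is reported in ms, as it is here.

```python
import re

# Hypothetical helper for illustration only; not part of asv or pandas.
# Assumes a row looks like: "- <before>±<err>ms <after>±<err>ms <ratio> <benchmark name>"
ROW = re.compile(
    r"^[-+]?\s*(?P<before>[\d.]+)±[\d.]+ms\s+"
    r"(?P<after>[\d.]+)±[\d.]+ms\s+"
    r"(?P<ratio>[\d.]+)\s+(?P<name>.+)$"
)


def parse_row(line):
    """Return (before_ms, after_ms, reported_ratio, benchmark_name) for one row."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised benchmark row: {line!r}")
    return (
        float(match["before"]),
        float(match["after"]),
        float(match["ratio"]),
        match["name"],
    )


row = (
    "- 317±1ms 65.5±2ms 0.21 "
    "io.json.NormalizeJSON.time_normalize_json('values', 'df_date_idx')"
)
before, after, ratio, name = parse_row(row)

recomputed = after / before  # ratio as reported in the table: after / before
speedup = before / after     # the same result expressed as a speedup factor

print(f"{name}: ratio {recomputed:.2f}, roughly {speedup:.1f}x faster")
```

Recomputing from the rounded times in the table can differ from the reported ratio in the last digit, since the report presumably derives it from unrounded measurements.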