Performance in SimpleJobRepository.update(StepExecution) since Spring Batch 6.x (original) (raw)

Please do a quick search on Github issues first, there might be already a duplicate issue for the one you are about to create.
If the bug is trivial, just go ahead and create the issue. Otherwise, please take a few moments and fill in the following sections:

Bug description

SimpleJobRepository.update(StepExecution) was changed in Spring Batch 6.x to call getStepExecution(stepExecution.getId()), which triggers a query against BATCH_JOB_EXECUTION_PARAMS on every chunk commit. On databases such as Oracle, which do not automatically create an index for foreign key constraints, JOB_EXECUTION_ID has no index on BATCH_JOB_EXECUTION_PARAMS, resulting in a full table scan on every chunk commit. As historical job execution data accumulates, this causes a severe and growing performance regression.

For reference, in Spring Batch 5.x, update(StepExecution) called checkForInterruption()synchronizeStatus(), which issued only a single lightweight query:

SELECT VERSION FROM BATCH_JOB_EXECUTION WHERE JOB_EXECUTION_ID = ?

In Spring Batch 6.x, getStepExecution(stepExecution.getId()) triggers the following 8 queries on every chunk commit:

# Table Description
1 BATCH_JOB_EXECUTION + BATCH_STEP_EXECUTION fetch JOB_EXECUTION_ID
2 BATCH_JOB_INSTANCE + BATCH_JOB_EXECUTION fetch JOB_INSTANCE_ID
3 BATCH_JOB_INSTANCE fetch job instance
4 BATCH_JOB_EXECUTION_PARAMS fetch job parameters ⚠️ full table scan on Oracle
5 BATCH_JOB_EXECUTION fetch job execution
6 BATCH_STEP_EXECUTION fetch step execution
7 BATCH_STEP_EXECUTION_CONTEXT fetch step execution context
8 BATCH_JOB_EXECUTION_CONTEXT fetch job execution context

In our environment, per-chunk latency increased from ~10ms (Spring Batch 5.x) to ~200ms (Spring Batch 6.x), caused by the full table scan on BATCH_JOB_EXECUTION_PARAMS.

Call chain:

TaskletStep (on chunk commit)
  → jobRepository.update(stepExecution)
    → SimpleJobRepository.update(StepExecution)
      → getStepExecution(id)                        // added in 6.x
        → JdbcStepExecutionDao
          → getJobExecutionId()                     // SQL #1
          → JdbcJobExecutionDao.getJobExecution()
            → getJobInstanceId()                    // SQL #2
            → getJobInstance()                      // SQL #3
            → getJobParameters()                    // SQL #4 ← BATCH_JOB_EXECUTION_PARAMS
            → GET_EXECUTION_BY_ID                   // SQL #5
          → getStepExecution()                      // SQL #6
        → fillStepExecutionDependencies()
          → getExecutionContext(stepExecution)       // SQL #7
        → getExecutionContext(jobExecution)          // SQL #8

Environment

Steps to reproduce

  1. Use Spring Batch 6.x with an Oracle database
  2. Accumulate a significant number of historical job executions (so BATCH_JOB_EXECUTION_PARAMS has many rows)
  3. Run a chunk-oriented job with a large number of chunks
  4. Observe per-chunk latency via SQL trace logs (logging.level.org.springframework.jdbc=TRACE)

Expected behavior

update(StepExecution) should complete in a similar time as Spring Batch 5.x (~10ms per chunk).

Minimal Complete Reproducible example

// Note: this is a pseudo-code to illustrate the scenario

@Test void updateStepExecutionShouldNotQueryJobExecutionParamsPerChunk() { // Given: a StepExecution with many rows in BATCH_JOB_EXECUTION_PARAMS // (simulate by inserting historical job execution params rows)

// When: jobRepository.update(stepExecution) is called (as happens on each chunk commit)
long start = System.currentTimeMillis();
for (int i = 0; i < 100; i++) {
    jobRepository.update(stepExecution);
}
long elapsed = System.currentTimeMillis() - start;

// Then: should complete quickly, not proportional to BATCH_JOB_EXECUTION_PARAMS table size
assertThat(elapsed).isLessThan(1000);

}

Suggestion

Adding an index on JOB_EXECUTION_ID in BATCH_JOB_EXECUTION_PARAMS would mitigate the full table scan issue on Oracle. It would be helpful if the official schema scripts (schema-oracle.sql, etc.) included this index, or at least if the documentation mentioned it as a recommendation for production environments.

CREATE INDEX BATCH_JOB_EXEC_PARAMS_IDX ON BATCH_JOB_EXECUTION_PARAMS(JOB_EXECUTION_ID);