Missing documentation about transaction management in multi-threaded steps

Bug description
When a ChunkOrientedStep is configured with a TaskExecutor, the item processor is executed outside any transaction. This is a regression from Spring Batch 5.2.5 (where the processor ran inside the chunk transaction) and contradicts the documentation, which states:

chunk: The Java-specific name of the dependency that indicates that this is an item-based step and 
the number of items to be processed before the transaction is committed.

Environment
Migrating from Spring Boot 3.5.13 / Spring Batch 5.2.5 (working) to Spring Boot 4.0.5 / Spring Batch 6.0.3 (broken)
Java 25, PostgreSQL

Steps to reproduce
Clone https://github.com/pkernevez/pb-hibernate-proxy/tree/springbatch-transaction-issue (/!\ branch: springbatch-transaction-issue) and run IssueSpringBatchTest.
The test fails because the processor is annotated with @Transactional(TxType.MANDATORY) and no transaction is active when it runs.
Commenting out .taskExecutor(executor) in BatchConfiguration makes the test pass.
To reproduce from scratch: build a ChunkOrientedStep with a chunk size, a TransactionManager, and a TaskExecutor, then annotate the processor with @Transactional(TxType.MANDATORY).

        return new StepBuilder("currencyStep", jobRepository)
                .<Long, CurrencyEntity>chunk(10)
                .transactionManager(transactionManager)
                // With the next line enabled, the processor runs without a transaction and the test fails.
                // .taskExecutor(executor)
                .reader(currencyReader)
                .processor(processor)
                .writer(writer)
                .build();

Expected behavior

Investigation
I cannot reconcile the current implementation with the behavior of Spring Batch 5, with the documentation, or with my own experience.
Here is my understanding of the flow, leaving scanMode aside.

Without a TaskExecutor:

  1. doExecute iterates over the chunks and starts a transaction before each one.
  2. For each chunk, processNextChunk is called and delegates to processChunkSequentially.
  3. The items are iterated and processed inside the transaction started in step 1.
    Everything works as expected.

With a TaskExecutor:

  1. doExecute iterates over the chunks and starts a transaction before each one, in the main thread.
  2. For each chunk, processNextChunk is called and delegates to processChunkConcurrently.
  3. The items are iterated and each one is submitted to the task executor.
  4. They are all executed in other threads, without a transaction ❌
  5. The main thread waits for all item executions to finish before returning to step 1 for the next chunk.
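
The flow above can be illustrated with a small plain-Java simulation. It is only a sketch, not Spring Batch code: it assumes the transaction is bound to the thread that started it (Spring keeps the transactional context in thread-local storage), and the names ThreadBoundTxDemo, TX_ACTIVE, process and runChunk are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a thread-bound transaction context; not a Spring Batch class.
class ThreadBoundTxDemo {

    // True only on the thread that "began" the transaction.
    static final ThreadLocal<Boolean> TX_ACTIVE = ThreadLocal.withInitial(() -> false);

    // Mimics a processor that requires an existing transaction (like TxType.MANDATORY).
    static String process(int item) {
        if (!TX_ACTIVE.get()) {
            throw new IllegalStateException("no active transaction for item " + item);
        }
        return "processed-" + item;
    }

    // Processes one chunk and returns how many items found an active transaction.
    static int runChunk(List<Integer> chunk, boolean concurrent) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        TX_ACTIVE.set(true); // step 1: "begin" the transaction on the calling thread
        int ok = 0;
        try {
            if (!concurrent) {
                // Sequential path: items run on the thread that owns the transaction.
                for (int item : chunk) {
                    process(item);
                    ok++;
                }
            } else {
                // Concurrent path: each item is submitted to a worker thread.
                List<Future<String>> futures = new ArrayList<>();
                for (int item : chunk) {
                    Callable<String> task = () -> process(item);
                    futures.add(executor.submit(task));
                }
                for (Future<String> future : futures) {
                    try {
                        future.get();
                        ok++;
                    } catch (ExecutionException e) {
                        // The worker thread saw no transaction, so the processor refused the item.
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException(e);
                    }
                }
            }
        } finally {
            TX_ACTIVE.remove(); // "end" the transaction on the calling thread
            executor.shutdown();
        }
        return ok;
    }

    public static void main(String[] args) {
        List<Integer> chunk = List.of(1, 2, 3);
        System.out.println("sequential ok = " + runChunk(chunk, false)); // prints "sequential ok = 3"
        System.out.println("concurrent ok = " + runChunk(chunk, true));  // prints "concurrent ok = 0"
    }
}
```

In the concurrent case every item is rejected, because the flag set by the main thread is not visible to the worker threads.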

Consequences

Proposed implementation

  1. doExecute iterates over the chunks and starts a transaction before each one, in the main thread.
  2. For each chunk, processNextChunk is called and delegates to processChunkConcurrently.
  3. The chunk is prepared for execution in another thread and submitted to the task executor.
  4. The main thread returns directly to step 1 to prepare the next chunk for the executor.
    4bis. In the task executor thread, a new transaction is started, then the items of the chunk are iterated and processed.
  5. The main thread waits for the completion of all the chunks (Futures).
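
Steps 3 to 5 of this proposal could look roughly like the following plain-Java simulation. It is a sketch under assumptions, not a patch for Spring Batch: the transaction is again modeled as a thread-local flag, and ChunkPerThreadTxDemo, processChunkInOwnTransaction and runStep are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical simulation of "one transaction per chunk, started in the worker thread".
class ChunkPerThreadTxDemo {

    // Stand-in for a thread-bound transaction context.
    static final ThreadLocal<Boolean> TX_ACTIVE = ThreadLocal.withInitial(() -> false);

    // Mimics a processor that requires an existing transaction (like TxType.MANDATORY).
    static String process(int item) {
        if (!TX_ACTIVE.get()) {
            throw new IllegalStateException("no active transaction for item " + item);
        }
        return "processed-" + item;
    }

    // Step 4bis: the worker begins its own transaction, processes the chunk, then ends it.
    static int processChunkInOwnTransaction(List<Integer> chunk) {
        TX_ACTIVE.set(true); // "begin" in the worker thread
        try {
            int processed = 0;
            for (int item : chunk) {
                process(item);
                processed++;
            }
            return processed; // a real implementation would commit here
        } finally {
            TX_ACTIVE.remove(); // "end" the transaction; the pooled thread is left clean
        }
    }

    // Steps 3 to 5: submit whole chunks, then wait for every Future in the calling thread.
    static int runStep(List<List<Integer>> chunks, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<Integer> chunk : chunks) {
                Callable<Integer> task = () -> processChunkInOwnTransaction(chunk);
                futures.add(executor.submit(task)); // steps 3 and 4: submit and move on
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // step 5: wait for each chunk to complete
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("chunk failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> chunks = List.of(List.of(1, 2, 3), List.of(4, 5), List.of(6));
        System.out.println("items processed = " + runStep(chunks, 2)); // prints "items processed = 6"
    }
}
```

Here the unit of concurrency is the chunk rather than the item, so each worker thread owns the transaction that its processor calls run in.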

Minimal Complete Reproducible example
Clone the repository https://github.com/pkernevez/pb-hibernate-proxy/tree/springbatch-transaction-issue
/!\ The correct branch is springbatch-transaction-issue.