Spring Batch Step by Step Example
Spring Batch is a lightweight, comprehensive batch framework for building the robust batch applications that enterprise systems rely on for their daily operations. In this post, we will build a simple Spring Batch example that reads data from a CSV file and writes it to an XML file.
1. Introduction
1.1 Spring Framework
- Spring is an open-source framework created to address the complexity of enterprise application development
- One of the chief advantages of the Spring framework is its layered architecture, which allows developers to be selective about which of its components they use while providing a cohesive framework for J2EE application development
- Spring framework provides support for and integration with various technologies, e.g.:
  - Support for Transaction Management
  - Support for interaction with different databases
  - Integration with Object-Relational Mapping frameworks, e.g. Hibernate, iBatis etc.
  - Support for Dependency Injection, which means all the required dependencies are resolved by the container
  - Support for REST style web-services
1.2 Spring Batch
- Spring Batch is a lightweight, comprehensive batch framework for building the robust batch applications that enterprise systems rely on for their daily operations
- It provides reusable functions that are essential when processing large volumes of records, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management
- It also provides more advanced technical services and features that enable extremely high-volume and high-performance batch jobs through optimization and partitioning techniques
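To give a feel for what partitioning means, here is a minimal plain-Java sketch. It does not use Spring Batch's own partitioning support; the class PartitionSketch and the method sumInPartitions are hypothetical names made up for this illustration. It only shows the general idea: split the records into ranges, let each worker handle one range, then combine the partial results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of partitioned processing; not a Spring Batch API.
class PartitionSketch {

    // Splits the records into contiguous ranges, sums each range on its own
    // worker thread, then combines the partial sums.
    static long sumInPartitions(final int[] records, int partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            int size = (records.length + partitions - 1) / partitions; // ceiling division
            for (int p = 0; p < partitions; p++) {
                final int from = Math.min(p * size, records.length);
                final int to = Math.min(from + size, records.length);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += records[i];
                    }
                    return sum;
                };
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // waits for that partition to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a partition", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A partition failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] records = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println("Total = " + sumInPartitions(records, 3)); // prints "Total = 55"
    }
}
```

Each partition works on its own slice of the input, so the workers never touch the same record.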
1.2.1 How Does Spring Batch Work?
A Spring Batch Job consists of the following components:
- Job: A Job represents the Spring Batch job. Each job can have one or more steps
- Step: A Step represents an independent, sequential phase of a Job and contains everything needed to define and control that phase. Spring Batch also offers a step that delegates to another Job to do its work, which helps to manage the dependencies between jobs and to break complex step logic into pieces that can be tested in isolation
- ItemReader: It is a strategy interface for providing the data. Implementations are expected to be stateful, and the read() method is called multiple times for each batch. Each call returns a different value, and finally null once all the input data is exhausted
- ItemProcessor: It is an interface for item transformations. Given an item as input, this interface provides an extension point that allows the application to implement its business logic in an item-oriented processing scenario
- ItemWriter: It is an interface for generic output operations. The class implementing this interface is responsible for serializing the objects as necessary. Generally, it is the responsibility of the implementing class to decide which technology to use for mapping and how it should be configured. The write() method is responsible for making sure that any internal buffers are flushed, and if a transaction is active it will usually also be necessary to discard the output on a subsequent rollback. The resource to which the writer is sending the data should normally be able to handle this itself
The below figure illustrates the relationships between these concepts:
Fig. 1: Anatomy of a Spring Batch Job
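The reader, processor and writer contracts described above can be sketched in plain Java. The interfaces below (SketchReader, SketchProcessor, SketchWriter) are hypothetical stand-ins written for this illustration, not the Spring Batch types. They only mirror the behaviour described: a stateful read() that returns null once the input is exhausted, a processor that transforms one item, and a writer that receives the output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Spring Batch interfaces.
interface SketchReader<T> { T read(); }
interface SketchProcessor<I, O> { O process(I item); }
interface SketchWriter<T> { void write(List<T> items); }

class ReaderContractSketch {

    // A stateful reader: every call to read() advances through the input
    // and returns null once nothing is left.
    static SketchReader<String> readerOver(List<String> lines) {
        final Iterator<String> it = lines.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    // Drives the read -> process -> write cycle until the reader signals exhaustion.
    static List<String> run(List<String> input) {
        SketchReader<String> reader = readerOver(input);
        SketchProcessor<String, String> processor = String::toUpperCase;
        List<String> written = new ArrayList<>();
        SketchWriter<String> writer = written::addAll;

        String item;
        while ((item = reader.read()) != null) {
            writer.write(Arrays.asList(processor.process(item)));
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("alpha", "beta"))); // prints [ALPHA, BETA]
    }
}
```

The loop never needs to know how many items exist; the null return value is the only end-of-input signal.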
1.2.2 How Can Spring Batch Help Us?
Spring Batch provides the following features that help us to solve multiple problems:
- It helps developers to structure the code in a clean way by providing the infrastructure that is used to implement, configure, and run batch jobs
- It uses chunk-oriented processing, where items are processed one by one and the transaction is committed when the chunk size is met. In other words, it gives developers an easy way to manage the size of the transactions
- It provides proper error handling. For example, developers can skip items if an exception is thrown and configure the retry logic that determines whether the batch job should retry the failed operation. Developers can also configure the logic that decides whether or not the transaction is rolled back
- It writes comprehensive execution metadata to the database. This metadata describes each job execution and step execution, and developers can use it for troubleshooting
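The chunk-oriented flow and the skip behaviour listed above can be pictured with a small plain-Java sketch. This is a simplified model under stated assumptions, not how Spring Batch is implemented: there is no real transaction, a "commit" is just a list being recorded, and the names ChunkProcessingSketch, Result and run are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of chunk-oriented processing with skip; not a Spring Batch API.
class ChunkProcessingSketch {

    // Collects what the sketch "committed" and "skipped" so the flow is observable.
    static final class Result {
        final List<List<Integer>> committedChunks = new ArrayList<>();
        final List<String> skipped = new ArrayList<>();
    }

    // Processes items one by one, buffers the results, and "commits" the buffer
    // each time it reaches chunkSize. Items that fail to parse are skipped.
    static Result run(List<String> rawItems, int chunkSize) {
        Result result = new Result();
        List<Integer> buffer = new ArrayList<>();
        for (String raw : rawItems) {
            try {
                buffer.add(Integer.parseInt(raw.trim()));
            } catch (NumberFormatException e) {
                result.skipped.add(raw); // skip the bad item and keep going
                continue;
            }
            if (buffer.size() == chunkSize) {
                result.committedChunks.add(new ArrayList<>(buffer)); // one commit per full chunk
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            result.committedChunks.add(new ArrayList<>(buffer)); // final partial chunk
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = run(Arrays.asList("1", "2", "oops", "3", "4", "5"), 2);
        System.out.println("Committed chunks: " + r.committedChunks); // [[1, 2], [3, 4], [5]]
        System.out.println("Skipped items: " + r.skipped);            // [oops]
    }
}
```

A larger chunk size means fewer commits but more work lost if a chunk fails, which is why the chunk size is the knob for managing transaction size.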
Now, open up the Eclipse IDE and let’s see how to implement the Spring Batch example!
2. Spring Batch Step by Step Example
2.1 Tools Used
We are using Eclipse Kepler SR2, JDK 8, MySQL and Maven. Having said that, we have tested the code against JDK 1.7 and it works well.
2.2 Project Structure
First, let's review the final project structure, in case you are unsure where to create the corresponding files or folders later!
Fig. 2: Spring Batch Application Structure
2.3 Project Creation
This section demonstrates how to create a Java-based Maven project with Eclipse. In Eclipse IDE, go to File -> New -> Maven Project.
Fig. 3: Create Maven Project
In the New Maven Project window, you will be asked to select the project location. By default, ‘Use default workspace location’ is selected. Select the ‘Create a simple project (skip archetype selection)’ checkbox and click the Next button to proceed.
Fig. 4: Project Details
You will be asked to ‘Enter the group and the artifact id for the project’. We will input the details as shown in the image below. The version number will be 0.0.1-SNAPSHOT by default.
Fig. 5: Archetype Parameters
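Click on Finish and the creation of the Maven project is complete. Eclipse downloads the Maven dependencies and creates a pom.xml file with the following code:
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>SpringBatch</groupId>
	<artifactId>SpringBatch</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<packaging>jar</packaging>
</project>
We can now start adding the dependencies we need, such as Spring Core, Spring Context, and Spring Batch. Let’s start building the application!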
3. Application Building
Below are the steps involved in developing this application.
3.1 Maven Dependencies
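Here we specify the required dependencies; Maven will resolve the remaining dependencies automatically. The updated file will have the following code:
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>SpringBatch</groupId>
	<artifactId>SpringBatch</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<packaging>jar</packaging>
	<dependencies>
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-core</artifactId>
			<version>4.3.5.RELEASE</version>
		</dependency>
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-context</artifactId>
			<version>4.3.5.RELEASE</version>
		</dependency>
		<dependency>
			<groupId>org.springframework.batch</groupId>
			<artifactId>spring-batch-core</artifactId>
			<version>3.0.7.RELEASE</version>
		</dependency>
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-oxm</artifactId>
			<version>3.2.2.RELEASE</version>
		</dependency>
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-jdbc</artifactId>
			<version>4.3.5.RELEASE</version>
		</dependency>
		<dependency>
			<groupId>mysql</groupId>
			<artifactId>mysql-connector-java</artifactId>
			<version>5.1.27</version>
		</dependency>
	</dependencies>
	<build>
		<finalName>${project.artifactId}</finalName>
	</build>
</project>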
3.2 Java Class Creation
Let’s create the required Java files. Right-click on the src/main/java folder, New -> Package.
Fig. 6: Java Package Creation
A new pop-up window will open where we will enter the package name as: com.jcg.spring.batch.
Fig. 7: Java Package Name (com.jcg.spring.batch)
Once the package is created in the application, we will need to create the Model and the Implementation classes. Right-click on the newly created package: New -> Class.
Fig. 8: Java Class Creation
A new pop-up window will open; enter the file name as: Report. The POJO model class will be created inside the package: com.jcg.spring.batch.
Fig. 9: Java Class (Report.java)
Repeat the step (i.e. Fig. 8) and enter the filename as: CustomItemProcessor.
Fig. 10: Java Class (CustomItemProcessor.java)
Again, repeat the step (i.e. Fig. 8) and enter the filename as: ReportFieldSetMapper.
Fig. 11: Java Class (ReportFieldSetMapper.java)
To create the utility or the implementation class, repeat the step (i.e. Fig. 8) and enter the filename as: AppMain.
Fig. 12: Java Class (AppMain.java)
3.2.1 Implementation of Model Class
This is a simple model class: the CSV values are mapped to a Report object, which is then written to an XML file. Add the following code to it:
Report.java
package com.jcg.spring.batch;
import java.math.BigDecimal;
import java.util.Date;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement(name = "record")
public class Report {
private int id;
private BigDecimal sales;
private int qty;
private String staffName;
private Date date;
@XmlAttribute(name = "id")
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
@XmlElement(name = "sales")
public BigDecimal getSales() {
return sales;
}
public void setSales(BigDecimal sales) {
this.sales = sales;
}
@XmlElement(name = "qty")
public int getQty() {
return qty;
}
public void setQty(int qty) {
this.qty = qty;
}
@XmlElement(name = "staffName")
public String getStaffName() {
return staffName;
}
public void setStaffName(String staffName) {
this.staffName = staffName;
}
public Date getDate() {
return date;
}
public void setDate(Date date) {
this.date = date;
}
@Override
public String toString() {
return "Report [Id=" + id + ", Sales=" + sales + ", Qty=" + qty + ", Staff-name=" + staffName + "]";
}
}
3.2.2 Implementation of Processor Class
This is a simple processor that runs on each item before it is handed to the ItemWriter. Add the following code to it:
CustomItemProcessor.java
package com.jcg.spring.batch;
import org.springframework.batch.item.ItemProcessor;
public class CustomItemProcessor implements ItemProcessor<Report, Report> {
	// Logs each item and passes it on unchanged to the writer
	@Override
	public Report process(Report itemObj) throws Exception {
		System.out.println("Processing Item= " + itemObj);
		return itemObj;
	}
}
3.2.3 Implementation of Mapper Class
This class is a custom FieldSetMapper that maps the CSV fields to the Report class. A custom mapper is needed so that the date column, which arrives as dd/MM/yyyy text, can be converted to a Date. Add the following code to it:
ReportFieldSetMapper.java
package com.jcg.spring.batch;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;
public class ReportFieldSetMapper implements FieldSetMapper<Report> {
private SimpleDateFormat dateFormatObj = new SimpleDateFormat("dd/MM/yyyy");
@Override
public Report mapFieldSet(FieldSet fieldSetObj) throws BindException {
// A fresh Report per row; a local variable avoids sharing state between calls
Report reportObj = new Report();
reportObj.setId(fieldSetObj.readInt(0));
reportObj.setSales(fieldSetObj.readBigDecimal(1));
reportObj.setQty(fieldSetObj.readInt(2));
reportObj.setStaffName(fieldSetObj.readString(3));
String csvDate = fieldSetObj.readString(4);
try {
reportObj.setDate(dateFormatObj.parse(csvDate));
} catch (ParseException parseExceptionObj) {
parseExceptionObj.printStackTrace();
}
return reportObj;
}
}
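To see the column-to-field conversion in isolation, here is a minimal plain-Java sketch that parses one CSV line without any Spring classes. The column order (id, sales, qty, staff name, date) and the dd/MM/yyyy pattern come from the mapper above. Everything else is an assumption made for illustration: the class CsvLineMappingSketch, the simple split on commas (no quoted fields), the strict date parsing via setLenient(false), and the sample row itself.

```java
import java.math.BigDecimal;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical illustration of mapping one CSV line to typed fields; not a Spring Batch API.
class CsvLineMappingSketch {

    // Holder for the parsed values; field order mirrors the mapper above.
    static final class ParsedLine {
        int id;
        BigDecimal sales;
        int qty;
        String staffName;
        Date date;
    }

    // Splits one comma-separated line and converts each column to its target type.
    // Assumes no quoted fields or embedded commas (a simplification for this sketch).
    static ParsedLine parse(String line) {
        String[] cols = line.split(",");
        if (cols.length != 5) {
            throw new IllegalArgumentException("Expected 5 columns but found " + cols.length);
        }
        SimpleDateFormat format = new SimpleDateFormat("dd/MM/yyyy");
        format.setLenient(false); // reject impossible dates such as 31/02/2016

        ParsedLine parsed = new ParsedLine();
        parsed.id = Integer.parseInt(cols[0].trim());
        parsed.sales = new BigDecimal(cols[1].trim());
        parsed.qty = Integer.parseInt(cols[2].trim());
        parsed.staffName = cols[3].trim();
        try {
            parsed.date = format.parse(cols[4].trim());
        } catch (ParseException e) {
            // Fail loudly instead of leaving the date null
            throw new IllegalArgumentException("Bad date in line: " + line, e);
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Hypothetical sample row, not taken from the article's CSV file.
        ParsedLine p = parse("1001,213100.50,980,Sample Staff,29/07/2016");
        System.out.println(p.id + " " + p.sales + " " + p.qty + " " + p.staffName);
    }
}
```

Unlike the mapper above, this sketch throws on a bad date rather than printing the stack trace and continuing, which is one possible design choice when a missing date should stop the row from being written.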
3.2.4 Implementation of Utility Class
This class loads the beans from the context file (i.e. spring-beans.xml) and calls the jobLauncherObj.run() method to execute the job. Add the following code to it:
AppMain.java
package com.jcg.spring.batch;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;
public class AppMain {
static Job jobObj;
static JobLauncher jobLauncherObj;
static ApplicationContext contextObj;
private static String[] springConfig = {"spring/batch/jobs/spring-beans.xml" };
public static void main(String[] args) {
// Loading The Bean Definition From The Spring Configuration File
contextObj = new ClassPathXmlApplicationContext(springConfig);
jobObj = (Job) contextObj.getBean("helloWorldJob");
jobLauncherObj = (JobLauncher) contextObj.getBean("jobLauncher");
try {
JobExecution execution = jobLauncherObj.run(jobObj, new JobParameters());
System.out.println("Exit Status : " + execution.getStatus());
} catch (Exception exceptionObj) {
exceptionObj.printStackTrace();
}
System.out.println("Done");
}
}
3.3 Configuration File
To configure the Spring Batch framework, developers need to implement a bean configuration, a data source, and a Spring context file, i.e. spring-beans.xml, spring-datasource.xml, and spring-context.xml respectively. Right-click on the SpringBatch/src/main/resources/spring/batch/config folder, New -> Other.
Fig. 13: XML File Creation
A new pop-up window will open; select the wizard as an XML file.
Fig. 14: Wizard Selection
Again, a pop-up window will open. Verify the parent folder location as: SpringBatch/src/main/resources/spring/batch/config and enter the file name as: spring-context.xml. Click Finish.
Fig. 15: spring-context.xml
Once the XML file is created, we will add the following code to it:
spring-context.xml
Repeat the step (i.e. Fig. 13) and enter the filename as: spring-datasource.xml.
Fig. 16: spring-datasource.xml
Once the XML file is created, we will add the following code to it:
spring-datasource.xml
Again, repeat the step (i.e. Fig. 13) and enter the filename as: spring-beans.xml.
Fig. 17: spring-beans.xml
Once the XML file is created, we will add the following code to it:
spring-beans.xml
com.jcg.spring.batch.Report
4. Run the Application
To run the application, right-click on the AppMain class, Run As -> Java Application. Developers can debug the example and see what happens after every step. Enjoy!
Fig. 18: Run the Application
5. Project Demo
Running the above program as a Java application, the code shows the following status as output.
Fig. 19: Application Output
Developers can see that we have processed all the input records and the XML file is found in the project/xml folder.
That’s all for this post. Happy Learning!
6. Conclusion
This article introduced Spring Batch and walked through the basic configuration needed to read data from a CSV file and write it to an XML file. I hope it gave you what you were looking for.
7. Download the Eclipse Project
This was an example of Spring Batch for beginners.
You can download the full source code of this example here: SpringBatch