Spring Batch Hibernate Example (original) (raw)

This article is a tutorial about Spring Batch with Hibernate. We will use Spring Boot to speed our development process.

1. Introduction

Spring Batch is a lightweight, scale-able and comprehensive batch framework to handle data at massive scale. Spring Batch builds upon the spring framework to provide intuitive and easy configuration for executing batch applications. Spring Batch provides reusable functions essential for processing large volumes of records, including cross-cutting concerns such as logging/tracing, transaction management, job processing statistics, job restart, skip and resource management.

Spring Batch has a layered architecture consisting of three components:

Let us dive into spring batch with a simple example of reading persons from a CSV file and loading them into embedded HSQL Database. Since we use the embedded database, data will not be persisted across sessions.

2. Technologies Used

3. Spring Batch Project

Spring Boot Starters provides more than 30 starters to ease the dependency management for your project. The easiest way to generate a Spring Boot project is via Spring starter tool with the steps below:

A Gradle Project will be generated. If you prefer Maven, use Maven instead of Gradle before generating the project. Import the project into your Java IDE.

3.1 Gradle File

Below we can see the generated build file for our project.

build.gradle

buildscript { ext { springBootVersion = '2.0.0.RELEASE' } repositories { mavenCentral() } dependencies { classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}") } } apply plugin: 'java' apply plugin: 'eclipse' apply plugin: 'idea' apply plugin: 'org.springframework.boot' apply plugin: 'io.spring.dependency-management' group = 'com.JCG' version = '0.0.1-SNAPSHOT' sourceCompatibility = 1.8 repositories { mavenCentral() } dependencies { compile('org.springframework.boot:spring-boot-starter-batch') compile('org.springframework.boot:spring-boot-starter-data-jpa') runtime('org.hsqldb:hsqldb') compile "org.projectlombok:lombok:1.16.8" testCompile('org.springframework.boot:spring-boot-starter-test') testCompile('org.springframework.batch:spring-batch-test') }

3.2 Data file

Sample CSV

FirstName,LastName Jill,Doe Joe,Doe Justin,Doe Jane,Doe John,Doe

3.3 Spring Batch Configuration

Below we will cover the Java configuration for Spring Boot, Batch and Hibernate. We will discuss each part of the configuration below.

Application Class

package com.JCG;

import org.springframework.boot.SpringApplication; import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication public class Application {

public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
}

}

Batch Configuration

package com.JCG.config;

import com.JCG.model.Person; import com.JCG.model.PersonRepository; import org.slf4j.Logger; import org.slf4j.LoggerFactory; import org.springframework.batch.core.*; import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing; import org.springframework.batch.core.configuration.annotation.JobBuilderFactory; import org.springframework.batch.core.configuration.annotation.StepBuilderFactory; import org.springframework.batch.core.launch.support.RunIdIncrementer; import org.springframework.batch.item.ItemProcessor; import org.springframework.batch.item.database.JpaItemWriter; import org.springframework.batch.item.file.FlatFileItemReader; import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper; import org.springframework.batch.item.file.mapping.DefaultLineMapper; import org.springframework.batch.item.file.transform.DelimitedLineTokenizer; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Configuration; import org.springframework.core.io.ClassPathResource;

import javax.persistence.EntityManagerFactory;

@Configuration @EnableBatchProcessing public class BatchConfiguration { @Autowired public JobBuilderFactory jobBuilderFactory;

@Autowired
public StepBuilderFactory stepBuilderFactory;

@Autowired
EntityManagerFactory emf;

@Autowired
PersonRepository personRepository;

private static final Logger log = LoggerFactory.getLogger(BatchConfiguration.class);

@Bean
public FlatFileItemReader reader() {
    FlatFileItemReader reader = new FlatFileItemReader<>();
    reader.setResource(new ClassPathResource("sample-data.csv"));
    reader.setLinesToSkip(1);

    DefaultLineMapper lineMapper = new DefaultLineMapper<>();
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setNames("firstName", "lastName");

    BeanWrapperFieldSetMapper fieldSetMapper = new BeanWrapperFieldSetMapper<>();
    fieldSetMapper.setTargetType(Person.class);

    lineMapper.setFieldSetMapper(fieldSetMapper);
    lineMapper.setLineTokenizer(tokenizer);
    reader.setLineMapper(lineMapper);

    return reader;
}


@Bean
public JpaItemWriter writer() {
    JpaItemWriter writer = new JpaItemWriter();
    writer.setEntityManagerFactory(emf);
    return writer;
}

@Bean
public ItemProcessor<Person, Person> processor() {
    return (item) -> {
        item.concatenateName();
        return item;
    };
}

@Bean
public Job importUserJob(JobExecutionListener listener) {
    return jobBuilderFactory.get("importUserJob")
            .incrementer(new RunIdIncrementer())
            .listener(listener)
            .flow(step1())
            .end()
            .build();
}

@Bean
public Step step1() {
    return stepBuilderFactory.get("step1")
            .<Person, Person>chunk(10)
            .reader(reader())
            .processor(processor())
            .writer(writer())
            .build();
}

@Bean
public JobExecutionListener listener() {
    return new JobExecutionListener() {


        @Override
        public void beforeJob(JobExecution jobExecution) {
            /**
             * As of now empty but can add some before job conditions
             */
        }

        @Override
        public void afterJob(JobExecution jobExecution) {
            if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
                log.info("!!! JOB FINISHED! Time to verify the results");
                personRepository.findAll().
                        forEach(person -> log.info("Found <" + person + "> in the database."));
            }
        }
    };
}

}

Lines 25 indicates that it is a configuration class and should be picked up by spring boot to wire up the beans and dependencies. Line 26 is used to enable batch support for our application. Spring defines a Job which contains multiple Step to be executed. In our example, we use only a single step for our importUserJob. We use a JobExecutionListener to track the job execution which we will cover below. A Step could be a TaskletStep(contains a single function for execution) or Step which includes a Reader, Processor and Writer. In the above example, We have used Step.

3.3.1 Reader

Lines 42-60 include our reader configuration. We use FlatFileItemReader to read from our CSV file. The advantage of using an inbuilt reader is that it handles application failures gracefully and will support restarts. It can also skip lines during errors with a configurable skip limit.

It needs the following parameters to successfully read the file line by line.

Reader reads the items in the chunk(10) which is specified by the chunk config in line 91.

3.3.2 Processor

Spring does not offer an inbuilt processor and is usually left to the custom implementation. Here, We are using a lambda function to transform the incoming Person object. We call the concatenateName function to concatenate the first name and last name. We return the modified item to the writer. Processor does its execution one item at a time.

3.3.3 Writer

Here, we are using JpaItemWriter to write the model object into the database. JPA uses hibernate as the persistence provider to persist the data. The writer just needs the model to be written to the database. It aggregates the items received from the processor and flushes the data.

3.3.4 Listener

JobExecutionListener offers the methods beforeJob to execute before the job starts and afterJob which executes after job has been completed. Generally, these methods are used to collect various job metrics and sometimes initialize constants. Here, we use afterJob to check whether the data got persisted. We use a repository method findAll to fetch all the persons from our database and display it.

3.4 Model/Hibernate configuration

application.properties

spring.jpa.hibernate.ddl-auto=create-drop spring.jpa.show-sql=true

Here we specified that tables should be created before use and destroyed when the application terminates. Also, we have specified configuration to show SQL ran by hibernate in the console for debugging. Rest of the configuration of wiring Datasource to hibernate and then in turn to JPA EntityManagerfactory is handled by JpaRepositoriesAutoConfiguration and HibernateJpaAutoConfiguration.

Model Class(Person)

package com.JCG.model;

import lombok.*;

import javax.persistence.Entity; import javax.persistence.GeneratedValue; import javax.persistence.Id; import javax.persistence.Transient;

@Entity @Getter @Setter @NoArgsConstructor @RequiredArgsConstructor @ToString(exclude={"firstName","lastName"}) public class Person {

@Id
@GeneratedValue
private int id;

@Transient
@NonNull
private String lastName;

@Transient
@NonNull
private String firstName;

@NonNull
private String name;

public void concatenateName(){
    this.setName(this.firstName+" "+this.lastName);
}

}

A model class should be annotated with Entity to be utilized by spring container. We have used Lombok annotations to generate getter, setter and Constructor from our fields. Fields firstName and lastName are annotated as Transient to indicate that these fields should not be persisted to the database. There is an id field which is annotated to generate the hibernate sequence while saving to the database.

Repository Class(PersonRepository)

package com.JCG.model;

import org.springframework.data.jpa.repository.JpaRepository; import org.springframework.stereotype.Repository;

@Repository public interface PersonRepository extends JpaRepository<Person,Integer> { }

This is just a repository implementation of Spring JPA repository. For a detailed example refer JPA Repository example.

4. Summary

Run the Application class from a Java IDE. Output similar to the below screenshot will be displayed. In this example, we saw a simple way to configure a Spring Batch Project Application.

SpringBatchHibernate logs

SpringBatchHibernate Logs

5. Download the Source Code

Download
You can download the full source code of this example here: SpringBatchHibernate

Photo of Rajagopal ParthaSarathi

Rajagopal works in software industry solving enterprise-scale problems for customers across geographies specializing in distributed platforms. He holds a masters in computer science with focus on cloud computing from Illinois Institute of Technology. His current interests include data science and distributed computing.