Core Concepts
Jobs, Steps, ItemReader, ItemProcessor, ItemWriter - the building blocks of Spring Batch
Job
A Job is the top-level container for batch processing. It represents the entire batch process that you want to run. A Job is composed of one or more Steps.
JobInstance
A logical run of a Job. Each unique combination of Job + identifying JobParameters creates a new JobInstance. A JobInstance that completed successfully cannot be launched again with the same parameters.
JobExecution
A single attempt to run a JobInstance. If a job fails, you can restart it - creating a new JobExecution for the same JobInstance.
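The JobInstance identity rule can be sketched without the framework. Everything below (the `SimpleBatchRuntime` class, `tryLaunch` method) is illustrative, not Spring Batch API; the real runtime enforces the same rule via the JobRepository:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Framework-free sketch: a JobInstance is identified by job name +
// identifying parameters, and a completed instance cannot be re-launched.
public class SimpleBatchRuntime {
    // Stand-in for the JobInstance identity (name + parameters).
    record InstanceKey(String jobName, Map<String, String> params) {}

    private final Set<InstanceKey> completedInstances = new HashSet<>();

    // Returns true if a new execution may start; false mimics Spring Batch
    // refusing to re-run an already-completed JobInstance.
    public boolean tryLaunch(String jobName, Map<String, String> params) {
        InstanceKey key = new InstanceKey(jobName, params);
        if (completedInstances.contains(key)) {
            return false; // same job + same parameters: instance already done
        }
        completedInstances.add(key); // pretend the execution succeeded
        return true;
    }

    public static void main(String[] args) {
        SimpleBatchRuntime runtime = new SimpleBatchRuntime();
        Map<String, String> monday = Map.of("date", "2024-01-01");
        Map<String, String> tuesday = Map.of("date", "2024-01-02");
        System.out.println(runtime.tryLaunch("importUserJob", monday));  // true
        System.out.println(runtime.tryLaunch("importUserJob", monday));  // false: same instance
        System.out.println(runtime.tryLaunch("importUserJob", tuesday)); // true: new instance
    }
}
```

In the real framework, a failed (not completed) execution leaves the JobInstance restartable, which is what creates a second JobExecution for the same JobInstance.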
```java
@Configuration
public class BatchConfig {

    @Bean
    public Job importUserJob(JobRepository jobRepository, Step step1, Step step2) {
        return new JobBuilder("importUserJob", jobRepository)
                .start(step1) // First step
                .next(step2)  // Second step
                .build();
    }
}
```

Step
A Step is an independent, sequential phase of a batch job. Each Step can be a simple Tasklet or a chunk-oriented processing step with Reader, Processor, and Writer.
Tasklet-based Step
Simple, single operation. Execute once and complete. Good for: deleting files, running SQL, sending notifications.
Chunk-based Step
Read → Process → Write pattern. Processes data in chunks. Good for: ETL jobs, file processing, data migration.
```java
// Tasklet-based Step (simple, one-time operation)
@Bean
public Step cleanupStep(JobRepository jobRepository,
                        PlatformTransactionManager transactionManager) {
    return new StepBuilder("cleanupStep", jobRepository)
            .tasklet((contribution, chunkContext) -> {
                // Clean up old files, send notification, etc.
                System.out.println("Cleanup completed!");
                return RepeatStatus.FINISHED;
            }, transactionManager)
            .build();
}

// Chunk-based Step (read-process-write)
@Bean
public Step processUsersStep(JobRepository jobRepository,
                             PlatformTransactionManager transactionManager,
                             ItemReader<User> reader,
                             ItemProcessor<User, User> processor,
                             ItemWriter<User> writer) {
    return new StepBuilder("processUsersStep", jobRepository)
            .<User, User>chunk(100, transactionManager) // Process 100 items per transaction
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .build();
}
```

💡 Chunk Size
The chunk size determines how many items are processed before committing a transaction. Larger chunks = fewer commits but more memory. Start with 100-1000 and tune based on performance.
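Under the hood, chunk-oriented processing is a read-process-write loop with one commit per chunk. The sketch below is framework-free and illustrative (`ChunkRunner` and its signature are not Spring Batch API), but it shows the contract: the reader yields one item at a time, a null from the processor filters the item, and the writer receives the whole chunk:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Illustrative sketch of the chunk-oriented loop.
public class ChunkRunner {
    public static <I, O> int run(int chunkSize,
                                 Supplier<I> reader,         // returns null when exhausted
                                 Function<I, O> processor,   // returns null to filter
                                 Consumer<List<O>> writer) { // receives the whole chunk
        int commits = 0;
        boolean done = false;
        while (!done) {
            List<O> chunk = new ArrayList<>();
            while (chunk.size() < chunkSize) {
                I item = reader.get();
                if (item == null) { done = true; break; } // reader exhausted
                O out = processor.apply(item);
                if (out != null) chunk.add(out);          // null = item skipped
            }
            if (!chunk.isEmpty()) {
                writer.accept(chunk); // in Spring Batch: one transaction commit per chunk
                commits++;
            }
        }
        return commits;
    }

    public static void main(String[] args) {
        Iterator<Integer> source = List.of(1, 2, 3, 4, 5).iterator();
        List<Integer> written = new ArrayList<>();
        int commits = run(2,
                () -> source.hasNext() ? source.next() : null,
                n -> n % 2 == 0 ? null : n * 10, // filter evens, transform odds
                written::addAll);
        System.out.println(written + " in " + commits + " commits"); // [10, 30, 50] in 2 commits
    }
}
```

This also shows why chunk size is the commit granularity: five items with chunk size 2 produce two commits here, and a failure mid-chunk would roll back only the current chunk.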
ItemReader
ItemReader is responsible for reading data from a source. It reads one item at a time and returns null when there's no more data.
```java
public interface ItemReader<T> {
    // Returns the next item, or null if no more items
    T read() throws Exception;
}

// Example: Reading from a CSV file
@Bean
public FlatFileItemReader<User> userReader() {
    return new FlatFileItemReaderBuilder<User>()
            .name("userReader")
            .resource(new ClassPathResource("users.csv"))
            .delimited()
            .names("id", "name", "email")
            .targetType(User.class)
            .build();
}
```

ItemProcessor
ItemProcessor transforms or validates data between reading and writing. It's optional - you can skip it if no transformation is needed.
Transform
Convert data format
Validate
Check business rules
Filter
Return null to skip
```java
public interface ItemProcessor<I, O> {
    // Transform input I to output O
    // Return null to filter/skip the item
    O process(I item) throws Exception;
}

// Example: Transform and validate users
@Bean
public ItemProcessor<User, User> userProcessor() {
    return user -> {
        // Skip inactive users
        if (!user.isActive()) {
            return null; // Returning null = skip this item
        }
        // Transform: uppercase the name
        user.setName(user.getName().toUpperCase());
        // Enrich: add processed timestamp
        user.setProcessedAt(LocalDateTime.now());
        return user;
    };
}
```

ItemWriter
ItemWriter writes processed items to a destination. Unlike ItemReader which reads one at a time, ItemWriter receives a list of items (the entire chunk).
```java
public interface ItemWriter<T> {
    // Write a chunk of items
    void write(Chunk<? extends T> chunk) throws Exception;
}

// Example: Writing to database
@Bean
public JdbcBatchItemWriter<User> userWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<User>()
            .sql("INSERT INTO users (id, name, email, processed_at) " +
                 "VALUES (:id, :name, :email, :processedAt)")
            .dataSource(dataSource)
            .beanMapped()
            .build();
}
```

JobRepository
JobRepository is the persistence mechanism for all batch metadata. It stores information about jobs, steps, and their executions - enabling restart and monitoring.
| Table | Purpose |
|---|---|
| BATCH_JOB_INSTANCE | Unique job + parameters combination |
| BATCH_JOB_EXECUTION | Each run attempt of a job |
| BATCH_STEP_EXECUTION | Each step's execution details |
| BATCH_JOB_EXECUTION_CONTEXT | Job-level state for restart |
| BATCH_STEP_EXECUTION_CONTEXT | Step-level state for restart |
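To see why this metadata enables restart, here is a toy illustration. `ResumableStep` and its checkpoint map are hypothetical; Spring Batch achieves the equivalent by persisting progress (e.g., a read count) in BATCH_STEP_EXECUTION_CONTEXT and reloading it when a new JobExecution starts for the same JobInstance:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of restartability via a persisted step-level execution context.
public class ResumableStep {
    // Stands in for the BATCH_STEP_EXECUTION_CONTEXT table.
    private final Map<String, Integer> executionContext = new HashMap<>();

    // Processes items, checkpointing progress; throws when index failAt is reached.
    public List<String> run(List<String> items, int failAt) {
        int start = executionContext.getOrDefault("read.count", 0); // resume point
        List<String> written = new ArrayList<>();
        for (int i = start; i < items.size(); i++) {
            if (i == failAt) {
                throw new IllegalStateException("simulated failure at item " + i);
            }
            written.add(items.get(i));
            executionContext.put("read.count", i + 1); // checkpoint after each item
        }
        return written;
    }

    public static void main(String[] args) {
        ResumableStep step = new ResumableStep();
        List<String> items = List.of("a", "b", "c", "d");
        try {
            step.run(items, 2); // first JobExecution fails after writing a, b
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        // Second JobExecution, same JobInstance: resumes from the checkpoint
        System.out.println(step.run(items, -1)); // [c, d]
    }
}
```

In the real framework the checkpoint is committed together with each chunk, so already-written items are never reprocessed on restart.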