
Monday, 6 April 2015

Multiple Steps in a Job

In the previous post we saw how to execute a job with a single step. A job can be composed of multiple steps. So I decided to extend the previous job to include two steps.
  1. Step 1 is the same as before: read CSV -> process -> write to List.
  2. Step 2 reads from the List -> process -> write to a CSV file.
To read from the List I introduced a new class:
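import java.util.List;

import org.springframework.batch.item.support.ListItemReader;
import org.springframework.batch.item.support.ListItemWriter;

// Reads the items that the step 1 ListItemWriter collected in memory.
public class MemItemReader extends ListItemReader<Person> {

   @SuppressWarnings("unchecked")
   public MemItemReader(ListItemWriter<Person> writer) {
      super((List<Person>) writer.getWrittenItems());
   }

}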
The class simply picks up the records from the ListItemWriter. For the processing I did not want to do anything complex, so I simply transformed the Person object into a BetterPerson instance:
public class BetterPerson extends Person {
   private String title;
   //setter getters
}
public class BetterPersonItemProcessor implements ItemProcessor<Person, BetterPerson> {

   public BetterPerson process(final Person person) throws Exception {

      final BetterPerson betterPerson = new BetterPerson();

      betterPerson.setFirstName(person.getFirstName());
      betterPerson.setLastName(person.getLastName());
      betterPerson.setTitle("??");
      System.out.println("Converting (" + person + ") into ("
      + betterPerson + ")");
      return betterPerson;
   }

}
Now to configure the new step:
<job id="reportJob" job-repository="jobRepository">
      <step id="step1" next="step2">
         <tasklet>
            <chunk reader="itemReader" writer="itemWriter"
               processor="personProcessor" commit-interval="2" />
         </tasklet>
      </step>
      <step id="step2">
         <tasklet>
            <chunk reader="memItemReader" writer="cvsFileItemWriter"
               processor="betterPersonProcessor" commit-interval="2" />
         </tasklet>
      </step>
   </job>
As seen here, we have a step with id step1 followed by step2. The next attribute provides the link between them. If the attribute is not specified, then we get an error:
Exception in thread "main" org.springframework.beans.factory.parsing.BeanDefinitionParsingException:
 Configuration problem: The element [step2] is unreachable 
Offending resource: class path resource [spring-config.xml]
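To make the role of the next attribute concrete, here is a minimal plain-Java sketch of a reachability check over a chain of steps. It is only an illustration and not Spring Batch's actual parser logic: the class StepChainSketch and its methods are hypothetical, and the job is modelled as a simple map from a step id to the id named in its next attribute.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a job: step id -> id in its "next" attribute (null if absent).
class StepChainSketch {

    // Follows the "next" links from the first declared step and
    // returns every declared step that was never visited.
    static List<String> unreachable(Map<String, String> steps) {
        Set<String> visited = new LinkedHashSet<>();
        String current = steps.isEmpty() ? null : steps.keySet().iterator().next();
        // visited.add returns false on a repeated id, which also guards against cycles.
        while (current != null && visited.add(current)) {
            current = steps.get(current);
        }
        List<String> result = new ArrayList<>(steps.keySet());
        result.removeAll(visited);
        return result;
    }

    // Builds the two-step job from this post; step1Next is the value of step1's next attribute.
    static Map<String, String> job(String step1Next) {
        Map<String, String> steps = new LinkedHashMap<>();
        steps.put("step1", step1Next);
        steps.put("step2", null);
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(unreachable(job("step2"))); // prints []
        System.out.println(unreachable(job(null)));    // prints [step2]
    }
}
```

With the link in place every step is visited. Without it, step2 is declared but never reached, which is the situation the error message describes.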
The configuration for the new beans is as below:
<beans:bean id="memItemReader" class="org.robin.learn.sb.batch.MemItemReader"
      scope="step">
      <beans:constructor-arg ref="itemWriter" />
   </beans:bean>


   <beans:bean id="betterPersonProcessor"
      class="org.robin.learn.sb.batch.BetterPersonItemProcessor" />

   <beans:bean id="cvsFileItemWriter"
      class="org.springframework.batch.item.file.FlatFileItemWriter">
      <beans:property name="resource" value="file:opt.csv" />
      <beans:property name="shouldDeleteIfExists" value="true" />
      <beans:property name="lineAggregator">
         <beans:bean
            class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
            <beans:property name="delimiter" value="," />
            <beans:property name="fieldExtractor">
               <beans:bean
                  class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                  <beans:property name="names"
                     value="title, firstName,lastName" />
               </beans:bean>
            </beans:property>
         </beans:bean>
      </beans:property>
   </beans:bean>
The cvsFileItemWriter was lifted from this example. It includes a delimiter property, so we can just as easily generate a pipe-, hyphen- or even space-separated file. The fieldExtractor property lists the fields to read from each item, in the order in which they are written to the line.
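The combined effect of the delimiter and the field names can be sketched in plain Java. This is an assumption-laden illustration rather than the library's implementation: DelimitedLineSketch, aggregate and samplePerson are hypothetical names, and the item is modelled as a map instead of a bean read through getters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for "extract fields by name, then join with a delimiter".
class DelimitedLineSketch {

    // Picks the values in the order given by names and joins them with the delimiter.
    static String aggregate(Map<String, String> item, String[] names, String delimiter) {
        List<String> values = new ArrayList<>();
        for (String name : names) {
            values.add(item.get(name.trim())); // trim tolerates "title, firstName"
        }
        return String.join(delimiter, values);
    }

    // One record shaped like the BetterPerson items in this post.
    static Map<String, String> samplePerson() {
        Map<String, String> person = new LinkedHashMap<>();
        person.put("firstName", "ROBIN");
        person.put("lastName", "VARGHESE");
        person.put("title", "??");
        return person;
    }

    public static void main(String[] args) {
        String[] names = "title, firstName,lastName".split(",");
        System.out.println(aggregate(samplePerson(), names, ",")); // prints ??,ROBIN,VARGHESE
        System.out.println(aggregate(samplePerson(), names, "|")); // prints ??|ROBIN|VARGHESE
    }
}
```

The order of the names, not the order of the fields in the item, decides the column order. Changing the delimiter changes nothing else.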
For the first bean, "memItemReader", I have specified a scope. The step scope is specific to Spring Batch and has not appeared in the earlier posts. If the memItemReader bean were created at startup like a regular singleton, it would be initialized with the still-empty list of the itemWriter bean. What we need is for the bean to be created only after the first step has executed, so we bind it to the step scope. The bean is now created when step2 starts executing:
15:21:16.894 [main] INFO  o.s.batch.core.job.SimpleStepHandler - Executing step: [step2].
...
15:21:16.904 [main] DEBUG o.s.batch.core.scope.StepScope - Creating object in scope=step, 
name=scopedTarget.memItemReader
15:21:16.904 [main] DEBUG o.s.b.f.s.DefaultListableBeanFactory - Creating instance of bean 
'scopedTarget.memItemReader'
15:21:16.904 [main] DEBUG o.s.b.f.s.DefaultListableBeanFactory - Returning cached instance 
of singleton bean 'itemWriter'
As seen here, because the bean is created after step1 has finished, the itemWriter already holds data, which is made available to the MemItemReader. Running the code now creates a CSV file on disk with the following content:
??,ROBIN,VARGHESE
??,ROHAN,NAIDU
??,ROMAN,BARLAN
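The timing problem that the step scope solves can be reproduced without Spring at all. The sketch below is a rough model under stated assumptions: SnapshotReader is a hypothetical stand-in for a reader that copies its source list when it is constructed, and a Supplier plays the role of the deferred, step-scoped bean creation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Plain-Java model of eager versus deferred reader creation; all names are hypothetical.
class StepScopeSketch {

    // Stand-in for a reader that copies its source list at construction time.
    static class SnapshotReader {
        private final List<String> items;
        private int index = 0;

        SnapshotReader(List<String> source) {
            this.items = new ArrayList<>(source); // snapshot is taken here
        }

        // Returns the next item, or null once the snapshot is exhausted.
        String read() {
            return index < items.size() ? items.get(index++) : null;
        }
    }

    // Reads until null and returns how many items the reader produced.
    static int drain(SnapshotReader reader) {
        int count = 0;
        while (reader.read() != null) {
            count++;
        }
        return count;
    }

    // Returns {items seen by the eager reader, items seen by the deferred reader}.
    static int[] eagerVsLazy() {
        List<String> written = new ArrayList<>(); // plays the role of the step 1 writer's list

        // Eager: built at "startup", before step 1 has written anything.
        SnapshotReader eager = new SnapshotReader(written);
        // Deferred: nothing is constructed until step 2 asks for the reader.
        Supplier<SnapshotReader> lazy = () -> new SnapshotReader(written);

        // "Step 1" runs and fills the list.
        written.add("ROBIN VARGHESE");
        written.add("ROHAN NAIDU");
        written.add("ROMAN BARLAN");

        return new int[] { drain(eager), drain(lazy.get()) };
    }

    public static void main(String[] args) {
        int[] counts = eagerVsLazy();
        System.out.println("eager reader items: " + counts[0]); // prints eager reader items: 0
        System.out.println("lazy reader items: " + counts[1]);  // prints lazy reader items: 3
    }
}
```

The eager reader took its snapshot while the list was still empty, so it yields nothing. The deferred reader is constructed after the first step has filled the list and sees all three records.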

1 comment:

  1. is it possible to create multiple jobs with same steps in the xml ?
    By Arun
