How to parse only selected parts of an XML file with Spring Batch and convert them to Java POJOs

Date: 2019-06-21 13:04:09

Tags: java xml spring-boot spring-batch

I am trying to parse an XML file and convert its contents into Java POJOs. My sample XML looks like this:

students.xml

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<group>
    <college>
        <name>Hogwards</name>
        <city>Unknown</city>
    </college>
    <student>
        <name>Tony Tester</name>
        <rollNo>1</rollNo>
        <enrollmentDate>2016-10-31</enrollmentDate>
        <sampleTimeStamp>2016-11-07T05:50:45</sampleTimeStamp>
        <salary>16.57</salary>
    </student>
    <student>
        <name>Nick Newbie</name>
        <rollNo>2</rollNo>
        <enrollmentDate>2017-10-31</enrollmentDate>
        <sampleTimeStamp>2016-11-07T05:50:45</sampleTimeStamp>
        <salary>29.68</salary>
    </student>
    <student>
        <name>Ian Intermediate</name>
        <rollNo>3</rollNo>
        <enrollmentDate>2018-10-31</enrollmentDate>
        <sampleTimeStamp>2016-11-07T05:50:45</sampleTimeStamp>
        <salary>789.62</salary>
    </student>
</group>

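My goal is to parse this file and load the student records into a database using Spring Batch. For my purposes the college element is just a header and is of no use, so I want my batch reader to skip it and read only the student elements, in chunks of roughly 300.

At the moment my code uses the GroupDTO class to unmarshal the entire file as a single item and builds the whole object graph at once, so I get no benefit from Spring Batch's chunk processing.

How can I ignore the college section and have Spring Batch read only the student elements chunk by chunk? Links to relevant material that might lead me to a solution would also help. Thanks in advance.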

XmlConfiguration.java

@Configuration
public class XmlConfiguration 
{

    @Autowired
    JobBuilderFactory jobBuilderFactory;

    @Autowired
    StepBuilderFactory stepBuilderFactory;

    @StepScope
    @Bean(name="xmlReader")
    public SynchronizedItemStreamReader<GroupDTO> reader() 
    {
        StaxEventItemReader<GroupDTO> xmlFileReader = new StaxEventItemReader<>();
        xmlFileReader.setResource(new ClassPathResource("students.xml"));
        xmlFileReader.setFragmentRootElementName("group");

        Map<String, Class<?>> aliases = new HashMap<>();
        aliases.put("group", GroupDTO.class);
        aliases.put("college", CollegeDTO.class);
        aliases.put("student", StudentDTO.class);

        XStreamMarshaller xStreamMarshaller = new XStreamMarshaller();
        xStreamMarshaller.setAliases(aliases);

        String dateFormat = "yyyy-MM-dd";
        String timeFormat = "HHmmss";
        String[] acceptableFormats = {timeFormat};

        xStreamMarshaller.getXStream().autodetectAnnotations(true);
        xStreamMarshaller.getXStream().registerConverter(new DateConverter(dateFormat, acceptableFormats));


        xStreamMarshaller.getXStream().addPermission(NoTypePermission.NONE);
        xStreamMarshaller.getXStream().addPermission(NullPermission.NULL);
        xStreamMarshaller.getXStream().addPermission(PrimitiveTypePermission.PRIMITIVES);
        xStreamMarshaller.getXStream().allowTypeHierarchy(Collection.class);
        xStreamMarshaller.getXStream().allowTypesByWildcard(new String[] {"com.example.demo.**"});

        xStreamMarshaller.getXStream().addImplicitCollection(GroupDTO.class, "list");          

        xmlFileReader.setUnmarshaller(xStreamMarshaller);      

        SynchronizedItemStreamReader<GroupDTO> synchronizedItemStreamReader = new SynchronizedItemStreamReader<>();
        synchronizedItemStreamReader.setDelegate(xmlFileReader);
        return synchronizedItemStreamReader;
    } 

    @Bean(name="xmlProcessor")
    public ItemProcessor<GroupDTO, GroupDTO> processor() 
    {
        return new Processor();
    }

    @Bean(name="xmlWriter")
    public ItemWriter<GroupDTO> writer() 
    {
        return new Writer();     
    }

    @Bean(name="xmljobListener")
    public JobExecutionListenerSupport jobListener() 
    {
        return new JobListener();
    }

    @JobScope
    @Bean(name="xmltaskExecutor")   
    public ThreadPoolTaskExecutor taskExecutor() 
    {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(50);
        executor.setMaxPoolSize(100);
        return executor;
    }

    @Bean(name="xmlStep")
    public Step xmlFileToDatabaseStep() 
    {
        return stepBuilderFactory.get("xmlStep")
                .<GroupDTO, GroupDTO>chunk(2)
                .reader(this.reader())
                .processor(this.processor())
                .writer(this.writer())
                .taskExecutor(this.taskExecutor())
                .build();
    }

    @Bean(name="xmlJob")
    public Job xmlFileToDatabaseJob(@Autowired @Qualifier("xmlStep") Step step) 
    {
        return jobBuilderFactory
                .get("xmlJob"+new Date())
                .incrementer(new RunIdIncrementer())
                .listener(this.jobListener())
                .flow(step)
                .end()
                .build();
    }

}

GroupDTO.java

@XStreamAlias("group")
public class GroupDTO 
{
    @XStreamAlias("college")
    private CollegeDTO college;

    @XStreamAlias("student")
    private List<StudentDTO> list;

    // ... getters, setters, constructors
}

CollegeDTO.java

public class CollegeDTO 
{
    private String name;
    private String city;
    // ... getters, setters and constructor
}

StudentDTO.java

public class StudentDTO 
{
    private String name;
    private Integer rollNo;
    private Date enrollmentDate;
    private Date sampleTimeStamp;
    private BigDecimal salary;
    // ... getters, setters and constructor
}

1 Answer:

Answer 0 (score: 0)

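Inside a job, a tasklet can contain a chunk element. The chunk has reader and writer attributes and can also have a processor attribute; the processor is optional.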

   <batch:job id="helloWorldJob">
      <batch:step id="step1">
         <batch:tasklet>
             <batch:chunk reader="itemReader" writer="itemWriter"
                    processor="itemProcessor" commit-interval="10">
             </batch:chunk>
         </batch:tasklet>
      </batch:step>
   </batch:job>  

Then, when you declare the reader bean, you define the mapper.

<!--  READER -->
<bean id = "itemReader" 
    class = "org.springframework.batch.item.file.FlatFileItemReader">  
   ...
   <property name = "lineMapper"> 
      <bean class = "org.springframework.batch.item.file.mapping.DefaultLineMapper"> 
          ...
          <property name = "fieldSetMapper"> 
             <bean class = "tudy.batch.Mapper" /> 
          </property> 
       </bean> 
    </property> 
 </bean> 

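This Mapper class is the right place to do what you need. It maps each record that the reader parses from the input file to an object. I think all you need to do is ignore the college tag.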

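import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

public class Mapper implements FieldSetMapper<Student> {

    @Override
    public Student mapFieldSet(FieldSet fieldSet) throws BindException {

        // Instantiate the target object
        Student student = new Student();

        // Set the fields by column index. The setter parameter types are assumed
        // to match the question's StudentDTO (String, Integer, Date, Date, BigDecimal).
        student.setName(fieldSet.readString(0));
        student.setRollNo(fieldSet.readInt(1));
        student.setEnrollmentDate(fieldSet.readDate(2, "yyyy-MM-dd"));
        student.setSampleTimeStamp(fieldSet.readDate(3, "yyyy-MM-dd'T'HH:mm:ss"));
        student.setSalary(fieldSet.readBigDecimal(4));

        return student;
    }
}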

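You can read the fields either by index or by name. Debug your code to confirm at which position, or under which name, the college data appears so that you can ignore it.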
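
One caveat: FlatFileItemReader and FieldSetMapper are meant for delimited or fixed-width text. For XML input, the StaxEventItemReader from the question already reads the file fragment by fragment, and the value passed to setFragmentRootElementName decides what counts as one item. With "group" the whole file is a single item. Pointing it at "student" instead (and changing the item type of the reader, processor, writer and step from GroupDTO to StudentDTO, with chunk(300)) should make every student element its own item, so the college element is never unmarshalled.

The sketch below is not Spring Batch code. It only illustrates the same fragment-by-fragment idea with the StAX API that ships with the JDK, under the assumption that every child of student is a plain text element. StudentFragmentDemo, its nested Student class and readInChunks are hypothetical names chosen for this example, and Student carries only three of the StudentDTO fields to keep it short.

```java
import java.io.Reader;
import java.io.StringReader;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; it is not part of Spring Batch.
class StudentFragmentDemo {

    // Reduced stand-in for the question's StudentDTO.
    static final class Student {
        final String name;
        final int rollNo;
        final BigDecimal salary;

        Student(String name, int rollNo, BigDecimal salary) {
            this.name = name;
            this.rollNo = rollNo;
            this.salary = salary;
        }
    }

    // Streams the document and groups <student> fragments into lists of at most chunkSize.
    static List<List<Student>> readInChunks(Reader xml, int chunkSize) {
        List<List<Student>> chunks = new ArrayList<>();
        List<Student> current = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            while (reader.hasNext()) {
                // Only <student> start tags trigger work; <group>, <college> and
                // the children of <college> are simply passed over.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "student".equals(reader.getLocalName())) {
                    current.add(readStudent(reader));
                    if (current.size() == chunkSize) {
                        chunks.add(current);
                        current = new ArrayList<>();
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse student XML", e);
        }
        if (!current.isEmpty()) {
            chunks.add(current); // last, possibly smaller, chunk
        }
        return chunks;
    }

    // Consumes one <student> fragment; the cursor must be on its start tag.
    private static Student readStudent(XMLStreamReader reader) throws XMLStreamException {
        String name = null;
        int rollNo = 0;
        BigDecimal salary = null;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.END_ELEMENT && "student".equals(reader.getLocalName())) {
                break; // end of this fragment
            }
            if (event != XMLStreamConstants.START_ELEMENT) {
                continue; // whitespace between child elements
            }
            String field = reader.getLocalName();
            String text = reader.getElementText().trim(); // leaves the cursor on the child's end tag
            switch (field) {
                case "name": name = text; break;
                case "rollNo": rollNo = Integer.parseInt(text); break;
                case "salary": salary = new BigDecimal(text); break;
                default: break; // enrollmentDate and sampleTimeStamp are skipped in this sketch
            }
        }
        return new Student(name, rollNo, salary);
    }

    public static void main(String[] args) {
        String xml = "<group>"
                + "<college><name>Hogwards</name><city>Unknown</city></college>"
                + "<student><name>Tony Tester</name><rollNo>1</rollNo><salary>16.57</salary></student>"
                + "<student><name>Nick Newbie</name><rollNo>2</rollNo><salary>29.68</salary></student>"
                + "<student><name>Ian Intermediate</name><rollNo>3</rollNo><salary>789.62</salary></student>"
                + "</group>";
        for (List<Student> chunk : readInChunks(new StringReader(xml), 2)) {
            System.out.println("chunk of " + chunk.size());
            for (Student s : chunk) {
                System.out.println("  " + s.rollNo + " " + s.name + " " + s.salary);
            }
        }
    }
}
```

Because only student start tags are acted on, the name element inside college is never mistaken for a student name. In a real Spring Batch step the framework does the grouping itself: the reader hands over one item per fragment and the chunk size set on the step decides how many items go to the writer in one transaction.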