Question

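I am using Spring Batch to process a large XML file (about 2 million entities) and update a database. The process is very time-consuming, so I tried to use partitioning to speed it up.

The approach I am pursuing is to split the large XML file into smaller files (say, 500 entities each) and then use Spring Batch to process each file in parallel. I am struggling to set up parallel processing of multiple XML files with Java configuration. These are the relevant beans I have configured: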

@Bean
public Partitioner partitioner(){
    MultiResourcePartitioner partitioner = new MultiResourcePartitioner();

    Resource[] resources;
    try {
        resources = resourcePatternResolver.getResources("file:/tmp/test/*.xml");
    } catch (IOException e) {
        throw new RuntimeException("I/O problems when resolving the input file pattern.", e);
    }
    partitioner.setResources(resources);
    return partitioner;
}

@Bean
public Step partitionStep(){
    return stepBuilderFactory.get("test-partitionStep")
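            // one partition (and one worker step execution) per resource returned by partitioner()
            .partitioner("personStep", partitioner())
            // the worker step that is run for each partition
            .step(personStep())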
            .taskExecutor(taskExecutor())
            .build();
}

@Bean
public Step personStep() {
    return stepBuilderFactory.get("personStep")
            .<Person, Person>chunk(100)
            .reader(personReader())
            .processor(personProcessor())
            .writer(personWriter)
            .build();
}

@Bean
public TaskExecutor taskExecutor() {
    SimpleAsyncTaskExecutor asyncTaskExecutor = new SimpleAsyncTaskExecutor("spring_batch");
    asyncTaskExecutor.setConcurrencyLimit(10);
    return asyncTaskExecutor;
}
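
The splitting step described before the configuration (one large file cut into smaller documents of a fixed number of entities) could be sketched with the JDK's StAX API. This is only an illustration, not the code used in the job above: the class name `XmlSplitter`, the element names passed in, and the use of in-memory strings instead of files are all assumptions made to keep the example self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: splits one XML document into smaller documents.
public class XmlSplitter {

    /** Returns documents that each wrap at most perChunk entityTag elements in a rootTag element. */
    public static List<String> split(String xml, String rootTag, String entityTag, int perChunk) {
        List<String> chunks = new ArrayList<>();
        XMLEventFactory events = XMLEventFactory.newInstance();
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            StringWriter buffer = null;
            XMLEventWriter writer = null;
            int inChunk = 0; // entities already written to the current chunk
            int depth = 0;   // nesting depth inside the current entity, 0 means outside
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                boolean entityStart = depth == 0 && event.isStartElement()
                        && entityTag.equals(event.asStartElement().getName().getLocalPart());
                if (!entityStart && depth == 0) {
                    continue; // outside any entity: skip the original root tags and whitespace
                }
                if (writer == null) { // open a new chunk lazily
                    buffer = new StringWriter();
                    writer = outputFactory.createXMLEventWriter(buffer);
                    writer.add(events.createStartElement("", "", rootTag));
                }
                writer.add(event); // copy the entity's events unchanged
                if (event.isStartElement()) depth++;
                if (event.isEndElement()) depth--;
                // depth is back to 0 only when the entity's own end tag was just written
                if (depth == 0 && ++inChunk == perChunk) {
                    chunks.add(finish(writer, buffer, events, rootTag));
                    writer = null;
                    inChunk = 0;
                }
            }
            if (writer != null) { // last, partially filled chunk
                chunks.add(finish(writer, buffer, events, rootTag));
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not split XML", e);
        }
        return chunks;
    }

    // Closes the wrapping root element and returns the chunk as text.
    private static String finish(XMLEventWriter writer, StringWriter buffer,
                                 XMLEventFactory events, String rootTag) throws XMLStreamException {
        writer.add(events.createEndElement("", "", rootTag));
        writer.flush();
        writer.close();
        return buffer.toString();
    }
}
```

Calling `XmlSplitter.split(xml, "people", "person", 500)` would produce documents of up to 500 `person` elements each. A file-based version would write each chunk to its own file so that a pattern such as `file:/tmp/test/*.xml` picks them up.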

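When I execute the job, I get various XML parsing errors (different ones each time). If I remove all but one of the XML files from the folder, processing works as expected.

I am not sure I fully understand the concept of Spring Batch partitioning, especially the "slave" part.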
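
To make the partitioning idea concrete, here is a plain-Java model of it, not Spring Batch code. A "master" creates one small context per input, and each "slave" (worker) execution receives exactly one context and builds its own reader from it. The class `PartitionModel`, the `input` key, the `person` element name, and the in-memory XML strings standing in for files are all assumptions for illustration. In Spring Batch itself, `MultiResourcePartitioner` stores each resource's URL in the step execution context under the key `fileName` by default, and a step-scoped reader can bind to it with `#{stepExecutionContext['fileName']}`. Since `personReader()` is not shown above, whether a reader shared between threads is the cause of the parsing errors is only a guess.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical model of master/worker partitioning; not part of Spring Batch.
public class PartitionModel {

    /** Master side: one context per input, keyed partition0, partition1, ... */
    public static Map<String, Map<String, String>> partition(List<String> inputs) {
        Map<String, Map<String, String>> contexts = new LinkedHashMap<>();
        for (int i = 0; i < inputs.size(); i++) {
            Map<String, String> context = new LinkedHashMap<>();
            context.put("input", inputs.get(i)); // assumed key name for this sketch
            contexts.put("partition" + i, context);
        }
        return contexts;
    }

    /** Worker side: creates a reader of its own and counts person elements in its input. */
    public static int runWorker(Map<String, String> context) {
        try {
            // A fresh reader per worker, so no parser state is shared between threads.
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(context.get("input")));
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "person".equals(reader.getLocalName())) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse partition input", e);
        }
    }

    /** Runs one worker per partition on a fixed-size pool and sums the results. */
    public static int runAll(List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (Map<String, String> context : partition(inputs).values()) {
                results.add(pool.submit(() -> runWorker(context)));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("A worker failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the model is that the reader is created inside the worker from that worker's own context. A single reader instance shared by all worker threads would have its input and parse position overwritten concurrently.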

Thanks!

Spring Batch: processing multiple files at the same time

0 answers: