Repeating a partitioned step causes a huge performance delay

Time: 2017-11-02 11:06:36

Tags: java spring-batch

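Below is my workflow.

Step 1: Download a large file (more than 5 GB)
Step 2: Split the file into smaller files

Step 3: Process the files, using a partitioner and a taskExecutor to process the split files

Repeat steps 1 to 3 until a certain condition is met, typically more than 10k repetitions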

The batch performs very well at first, but performance degrades over time. Note: data processing is not the bottleneck.

I suspect that the repeated partitioned step creates new threads on every repetition.
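One way to test that suspicion outside the batch is to count how many threads a pool's thread factory is asked to create across repetitions. The following is a minimal plain-JDK sketch, not Spring Batch code: `ThreadGrowthCheck` and `threadsCreated` are hypothetical names, and the no-op tasks merely stand in for partition work. It contrasts one pool reused across rounds with a fresh pool per round.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the repeated step; no Spring Batch classes are used. */
final class ThreadGrowthCheck {

    /** Returns how many threads were created while running the given number of rounds. */
    static int threadsCreated(int repetitions, int tasksPerRound, boolean reusePool) {
        AtomicInteger created = new AtomicInteger();
        // Thread factory that counts every thread the pool asks for.
        ThreadFactory counting = task -> {
            created.incrementAndGet();
            return new Thread(task);
        };
        ExecutorService shared = reusePool
                ? Executors.newFixedThreadPool(tasksPerRound, counting)
                : null;
        for (int round = 0; round < repetitions; round++) {
            ExecutorService pool = reusePool
                    ? shared
                    : Executors.newFixedThreadPool(tasksPerRound, counting);
            List<Future<?>> pending = new ArrayList<>();
            for (int t = 0; t < tasksPerRound; t++) {
                pending.add(pool.submit(() -> { })); // no-op task standing in for a partition
            }
            for (Future<?> f : pending) {
                try {
                    f.get(); // wait for the round to finish
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            if (!reusePool) {
                pool.shutdown(); // a per-round pool is discarded after its round
            }
        }
        if (shared != null) {
            shared.shutdown();
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("reused pool:        " + threadsCreated(5, 4, true));  // prints 4
        System.out.println("new pool per round: " + threadsCreated(5, 4, false)); // prints 20
    }
}
```

If the count in the real job grows with the number of repetitions, the executor is probably being recreated; if it stays flat, the slowdown likely comes from somewhere else.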

Below is my configuration:

@Bean
public Job myjob(JobBuilderFactory jobs) throws Exception {
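    return jobs.get("myjob")
            .start(DownloadStep())
            // Loop back to the download step while master() exits with CONTINUE_CONDITION
            .next(master()).on(CONTINUE_CONDITION).to(DownloadStep())
            // Otherwise finish with the clean-up step
            .from(master()).on(STOP_CONDITION).to(cleanUpStep())
            .end()
            .build();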
}

@Bean
@StepScope
public Partitioner partitioner() throws IOException {
    MultiResourcePartitioner multiResourcePartitioner = new MultiResourcePartitioner();
    ClassLoader cl = this.getClass().getClassLoader();
    ResourcePatternResolver resolver = new PathMatchingResourcePatternResolver(cl);
    // getResources declares IOException; the file path is elided in the original post
    Resource[] resources = resolver.getResources("file:" some file path);
    multiResourcePartitioner.setResources(resources);
    // The map returned by this call is discarded here
    multiResourcePartitioner.partition(10);
    return multiResourcePartitioner;
}
}


@Bean
public TaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();     
    taskExecutor.setMaxPoolSize(30);        
    return taskExecutor;
}   

@Bean
@Qualifier("master")
public Step master() {
    return stepBuilderFactory.get("master")
            .partitioner(process())
            .partitioner("process",partitioner) 
            .taskExecutor(taskExecutor())           
            .build();
}     
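
A side note on the taskExecutor bean above, which sets only the maximum pool size. Assuming ThreadPoolTaskExecutor keeps its documented defaults (core pool size 1 and an effectively unbounded queue) and delegates to java.util.concurrent.ThreadPoolExecutor, threads beyond the core size are only created once the queue is full, so the maximum of 30 might never be reached. The plain-JDK sketch below illustrates that rule; `PoolSizeDemo` and `largestPoolSize` are hypothetical names. The taskExecutor-17 thread name in the log further down suggests the real configuration may differ from the one posted, so this may not apply here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of how core size, max size and an unbounded queue interact. */
final class PoolSizeDemo {

    /** Submits blocking tasks and reports the largest number of threads the pool reached. */
    static int largestPoolSize(int corePoolSize, int maxPoolSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maxPoolSize, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>()); // unbounded queue
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy until all tasks are submitted
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();
        release.countDown(); // let the tasks finish
        pool.shutdown();     // queued tasks still run before the pool terminates
        return largest;
    }

    public static void main(String[] args) {
        System.out.println(largestPoolSize(1, 30, 16));  // prints 1: max size is never used
        System.out.println(largestPoolSize(16, 30, 16)); // prints 16: one thread per task
    }
}
```

If this turns out to be relevant, setting the core pool size explicitly would be the first thing to try.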

Update: Following Michael Minella's suggestion, I upgraded Spring Batch to 4.0.0.RC1, but performance did not improve.

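After more than 150 repetitions, the batch takes more than 15 minutes to create the partitioned step. I created 16 partitions for each file.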
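
For context on where the partition count comes from: to my understanding, MultiResourcePartitioner creates one partition per resource, named partition0, partition1, and so on, and does not use the grid size to limit that number. The sketch below is a plain-Java stand-in for that behaviour, not the Spring class itself. `PerResourcePartitions`, `sampleUrls` and the file paths are hypothetical, and the "fileName" key is assumed to match the partitioner's default key name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in with hypothetical names; no Spring Batch classes involved. */
final class PerResourcePartitions {

    /**
     * Builds one entry per resource, keyed "partition0", "partition1", ...
     * The gridSize argument is accepted but does not change the number of entries.
     */
    static Map<String, Map<String, String>> partition(List<String> resourceUrls, int gridSize) {
        Map<String, Map<String, String>> partitions = new LinkedHashMap<>();
        int index = 0;
        for (String url : resourceUrls) {
            Map<String, String> context = new LinkedHashMap<>();
            context.put("fileName", url); // assumed default key name
            partitions.put("partition" + index, context);
            index++;
        }
        return partitions;
    }

    /** Generates made-up file URLs for the demo. */
    static List<String> sampleUrls(int count) {
        List<String> urls = new ArrayList<>();
        for (int n = 0; n < count; n++) {
            urls.add("file:/tmp/split/part-" + n + ".txt"); // hypothetical path
        }
        return urls;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> result = partition(sampleUrls(16), 10);
        System.out.println(result.size());                     // prints 16
        System.out.println(result.containsKey("partition14")); // prints true
    }
}
```

Under that assumption, 16 split files yield 16 worker step executions per repetition, whatever grid size is configured.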

2017-12-05 17:00:51,660  INFO  [THREAD ID=main] XXXXXConfiguration. - Resource files: 16
2017-12-05 17:11:20,923  INFO   [THREAD ID=taskExecutor-17]  XXXProcessor. - XXXXListener beforeStep StepExecution: id=841, version=1, name=XXXStep:partition14, status=STARTED, exitStatus=EXECUTING, readCount=0, filterCount=0, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=0, rollbackCount=0
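
The gap between the two log lines above can be computed from their timestamps. A minimal sketch using java.time follows; `LogGap` and `between` are hypothetical names, and the pattern assumes the comma-separated millisecond format shown in the log.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for measuring the time between two log timestamps. */
final class LogGap {

    // Matches timestamps such as "2017-12-05 17:00:51,660"
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Returns the elapsed time from the earlier timestamp to the later one. */
    static Duration between(String earlier, String later) {
        return Duration.between(
                LocalDateTime.parse(earlier, STAMP),
                LocalDateTime.parse(later, STAMP));
    }

    public static void main(String[] args) {
        Duration gap = between("2017-12-05 17:00:51,660", "2017-12-05 17:11:20,923");
        System.out.println(gap.toMillis() + " ms"); // prints 629263 ms
    }
}
```

For this pair of lines the gap is roughly 10 minutes 29 seconds between the resources being resolved and a worker step's beforeStep callback.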
