Spring Batch Partitioned Step在发生不可跳过的异常后数小时停止

时间:2014-05-09 08:44:29

标签: spring-batch


我想验证Spring Batch的行为......
当运行Job的分区步骤时,我遇到了这个异常:

org.springframework.batch.core.JobExecutionException: Partition handler returned an unsuccessful step
at org.springframework.batch.core.partition.support.PartitionStep.doExecute(PartitionStep.java:111)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:137)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:64)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:152)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:131)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:135)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:301)
at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:134)
at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:127)

只有这个 - 之前没有可能触发此异常的异常,然后我的工作失败了。

在搜索前几个小时的日志时,我注意到了这些异常(其中3个在不同的分区步骤中):

06/05/2014 21:50:51.996 [Step3TaskExecutor-12] [] ERROR                   AbstractStep - Line (222) Encountered an error executing the step
org.springframework.retry.RetryException: Non-skippable exception in recoverer while processing; nested exception is java.io.FileNotFoundException: Source 'blabla....pdf' does not exist
    at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor$2.recover(FaultTolerantChunkProcessor.java:281)
    at org.springframework.retry.support.RetryTemplate.handleRetryExhausted(RetryTemplate.java:435)
    at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:304)
    at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:188)
    at org.springframework.batch.core.step.item.BatchRetryTemplate.execute(BatchRetryTemplate.java:217)
    at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor.transform(FaultTolerantChunkProcessor.java:290)
    at org.springframework.batch.core.step.item.SimpleChunkProcessor.process(SimpleChunkProcessor.java:192)
    at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:75)
    at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:395)
    at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)
    at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:267)
    at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:77)
    at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:368)
    at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215)
    at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:144)
    at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:253)
    at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
    at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:139)
    at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:136)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.FileNotFoundException: Source 'blabla....pdf' does not exist

对我来说,似乎很奇怪,在那些例外情况之后,工作继续运行,所以我认为只有发生此异常的奴隶步骤失败并且主步骤等待其余的从属步骤完成为了返回上面提到的第一个错误 有人可以验证这是问题吗?它让我疯狂了好几天

2 个答案:

答案 0 :(得分:1)

这是Spring Batch分区的正确行为。主步骤中的PartitionHandler在所有步骤全部返回(或超时)时立即评估所有步骤的结果。关于奴隶中发生的事情,那些记录的错误将是我的主要原因。但是,最终答案应该在作业存储库中(假设您使用的是数据库支持的实现)。当步骤失败(甚至是分区的从属)时,异常存储在那里。

答案 1 :(得分:0)

当我的 CPU 利用率非常高时,我收到了这个错误。 当我在配置中添加这个 bean 时,它对我有用:

@Bean
    public TaskExecutor asyncTaskExecutor() {
        SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
        taskExecutor.setConcurrencyLimit(numberOfCores);
        return taskExecutor;
    }

我在这里使用它:

@Bean
    public Step masterStep() throws Exception {
        return stepBuilderFactory.get("masterStep")
                .partitioner(slaveStep().getName(), partitioner())
                .step(slaveStep())
                .gridSize(gridSize)
                .taskExecutor(asyncTaskExecutor())
                .build();
    }