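I use Spring Batch to parse files, and I have the following scenario:
I am launching a job that has to parse a given file. For some unexpected reason (say, a power cut) the server fails and I have to restart the machine. After the restart, I want to resume the job from the point where it stopped before the power cut. That is, if the system had read 1,300 of 10,000 lines, it must now start reading at line 1,301. How can I implement this scenario with Spring Batch?
About the configuration: I use spring-integration, which polls a directory for new files. When a file arrives, spring-integration creates the Spring Batch job. Spring Batch uses a FlatFileItemReader to parse the file.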
Answer 0: (score: 3)
Here is a complete solution for restarting a job after a JVM crash.
1. Make the job restartable by setting restartable="true":
<job id="jobName" xmlns="http://www.springframework.org/schema/batch" restartable="true">
2. Code to restart the job
import java.util.Date;
import java.util.List;

import org.apache.commons.collections.CollectionUtils;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.JobOperator;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.beans.factory.annotation.Autowired;

public class RestartJob {

    @Autowired
    private JobExplorer jobExplorer;
    @Autowired
    JobRepository jobRepository;
    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    JobOperator jobOperator;

    public void restart() {
        try {
            // Fetch the single most recent instance of "jobName" from the database.
            List<JobInstance> jobInstances = jobExplorer.getJobInstances("jobName", 0, 1);
            if (CollectionUtils.isNotEmpty(jobInstances)) {
                JobInstance jobInstance = jobInstances.get(0);
                List<JobExecution> jobExecutions = jobExplorer.getJobExecutions(jobInstance);
                if (CollectionUtils.isNotEmpty(jobExecutions)) {
                    for (JobExecution execution : jobExecutions) {
                        // An execution still marked STARTED was interrupted by the crash:
                        // mark it FAILED, then restart it through the JobOperator.
                        if (execution.getStatus().equals(BatchStatus.STARTED)) {
                            execution.setEndTime(new Date());
                            execution.setStatus(BatchStatus.FAILED);
                            execution.setExitStatus(ExitStatus.FAILED);
                            jobRepository.update(execution);
                            jobOperator.restart(execution.getId());
                        }
                    }
                }
            }
        } catch (Exception e1) {
            e1.printStackTrace();
        }
    }
}
3. Supporting bean configuration
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean" p:dataSource-ref="dataSource" p:transactionManager-ref="transactionManager" p:lobHandler-ref="oracleLobHandler"/>
<bean id="oracleLobHandler" class="org.springframework.jdbc.support.lob.DefaultLobHandler"/>
<bean id="jobExplorer" class="org.springframework.batch.core.explore.support.JobExplorerFactoryBean" p:dataSource-ref="dataSource" />
<bean id="jobRegistry" class="org.springframework.batch.core.configuration.support.MapJobRegistry" />
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <property name="taskExecutor" ref="jobLauncherTaskExecutor" />
</bean>
<task:executor id="jobLauncherTaskExecutor" pool-size="6" rejection-policy="ABORT" />
<bean id="jobOperator" class="org.springframework.batch.core.launch.support.SimpleJobOperator" p:jobLauncher-ref="jobLauncher" p:jobExplorer-ref="jobExplorer" p:jobRepository-ref="jobRepository" p:jobRegistry-ref="jobRegistry"/>
Answer 1: (score: 1)
I followed the solution described in this link, and it seems to work for me!
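Answer 2: (score: 1)
An updated solution for Spring Batch 4. It takes the JVM start time into account to detect broken jobs. Note that this will not work in a clustered environment where multiple servers start jobs.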
@Bean
public ApplicationListener<ContextRefreshedEvent> resumeJobsListener(JobOperator jobOperator, JobRepository jobRepository,
        JobExplorer jobExplorer) {
    // restart jobs that failed due to
    return event -> {
        Date jvmStartTime = new Date(ManagementFactory.getRuntimeMXBean().getStartTime());

        // for each job
        for (String jobName : jobExplorer.getJobNames()) {
            // get latest job instance
            for (JobInstance instance : jobExplorer.getJobInstances(jobName, 0, 1)) {
                // for each of the executions
                for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                    if (execution.getStatus().equals(BatchStatus.STARTED) && execution.getCreateTime().before(jvmStartTime)) {
                        // this job is broken and must be restarted
                        execution.setEndTime(new Date());
                        execution.setStatus(BatchStatus.FAILED);
                        execution.setExitStatus(ExitStatus.FAILED);

                        for (StepExecution se : execution.getStepExecutions()) {
                            if (se.getStatus().equals(BatchStatus.STARTED)) {
                                se.setEndTime(new Date());
                                se.setStatus(BatchStatus.FAILED);
                                se.setExitStatus(ExitStatus.FAILED);
                                jobRepository.update(se);
                            }
                        }

                        jobRepository.update(execution);
                        try {
                            jobOperator.restart(execution.getId());
                        }
                        catch (JobExecutionException e) {
                            LOG.warn("Couldn't resume job execution {}", execution, e);
                        }
                    }
                }
            }
        }
    };
}
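The detection rule used by the listener above can be looked at in isolation. The following is only a sketch with hypothetical names (`OrphanedExecutionCheck`, `Status`, `isOrphaned`); the enum stands in for the few `BatchStatus` values involved, and nothing here is part of Spring Batch itself.

```java
import java.lang.management.ManagementFactory;
import java.time.Instant;

/** Hypothetical illustration of the "STARTED before this JVM booted" rule. */
final class OrphanedExecutionCheck {

    /** Stand-in for the BatchStatus values that matter here (assumption, not the real enum). */
    enum Status { STARTED, FAILED, COMPLETED }

    /**
     * An execution is treated as orphaned when it is still marked STARTED
     * but was created before the current JVM started, so this JVM cannot be running it.
     */
    static boolean isOrphaned(Status status, Instant createTime, Instant jvmStartTime) {
        return status == Status.STARTED && createTime.isBefore(jvmStartTime);
    }

    /** Start time of the running JVM, read from the standard runtime MXBean. */
    static Instant currentJvmStartTime() {
        return Instant.ofEpochMilli(ManagementFactory.getRuntimeMXBean().getStartTime());
    }
}
```

This also suggests why the approach is unsafe in a cluster: an execution that another node started earlier and is still running would satisfy the same condition on a node that boots later, and would be marked FAILED while it is in fact alive.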
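Answer 3: (score: 0)
In your case, I would create a step that records the last processed line in a file. Then create a second job that reads this file and starts processing from that specific line number.
That way, if the job stops for any reason, you can run a new job that continues the processing.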
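A minimal sketch of this checkpoint idea in plain Java, outside Spring Batch. The class name `CheckpointedLineProcessor`, its methods, and the checkpoint file layout (a single number) are all assumptions made for illustration, not something the answer specifies.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Consumer;

/** Hypothetical helper: processes a file line by line and records how far it got. */
final class CheckpointedLineProcessor {

    /** Returns the number of lines already processed, or 0 if no checkpoint file exists. */
    static long readCheckpoint(Path checkpoint) throws IOException {
        if (!Files.exists(checkpoint)) {
            return 0L;
        }
        String text = new String(Files.readAllBytes(checkpoint), StandardCharsets.UTF_8);
        return Long.parseLong(text.trim());
    }

    /** Writes to a temporary sibling file first, then moves it over the old checkpoint. */
    static void writeCheckpoint(Path checkpoint, long linesDone) throws IOException {
        Path tmp = checkpoint.resolveSibling(checkpoint.getFileName() + ".tmp");
        Files.write(tmp, Long.toString(linesDone).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, checkpoint, StandardCopyOption.REPLACE_EXISTING);
    }

    /**
     * Skips the lines covered by the checkpoint, hands every remaining line to the handler,
     * and updates the checkpoint after each handled line. Returns the total line count.
     */
    static long process(Path input, Path checkpoint, Consumer<String> handler) throws IOException {
        long alreadyDone = readCheckpoint(checkpoint);
        long lineNumber = 0;
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (lineNumber <= alreadyDone) {
                    continue; // handled in an earlier run
                }
                handler.accept(line);
                writeCheckpoint(checkpoint, lineNumber);
            }
        }
        return lineNumber;
    }
}
```

Calling `process` again with the same input and checkpoint paths after a crash resumes at the first line not covered by the checkpoint. Because the checkpoint is written only after the handler returns, a crash between the two would cause that one line to be handled again, so the handler should tolerate a repeat. Writing the checkpoint after every line is also slow; writing it once per batch of lines would be the more realistic trade-off.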
Answer 4: (score: 0)
You can also write it like this:
@RequestMapping(value = "/updateStatusAndRestart/{jobId}/{stepId}", method = GET)
public ResponseEntity<String> updateBatchStatus(@PathVariable("jobId") Long jobExecutionId, @PathVariable("stepId") Long stepExecutionId) throws Exception {
    // Mark the interrupted step execution as FAILED.
    StepExecution stepExecution = jobExplorer.getStepExecution(jobExecutionId, stepExecutionId);
    stepExecution.setEndTime(new Date(System.currentTimeMillis()));
    stepExecution.setStatus(BatchStatus.FAILED);
    stepExecution.setExitStatus(ExitStatus.FAILED);
    jobRepository.update(stepExecution);

    // Mark the owning job execution as FAILED as well.
    JobExecution jobExecution = stepExecution.getJobExecution();
    jobExecution.setEndTime(new Date(System.currentTimeMillis()));
    jobExecution.setStatus(BatchStatus.FAILED);
    jobExecution.setExitStatus(ExitStatus.FAILED);
    jobRepository.update(jobExecution);

    // Restart the job execution that was just marked FAILED.
    jobOperator.restart(jobExecution.getId());
    return new ResponseEntity<String>("<h1> Batch Status Updated !! </h1>", HttpStatus.OK);
}
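Here I use a REST API endpoint to pass in the jobExecutionId and stepExecutionId, and set the status of both the job_execution and the step_execution to FAILED. The job is then restarted through the JobOperator.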