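I created a POC project in which I use a Spring Batch local partitioning step to move 10 records from the Employee table to the NewEmployee table. I have configured 4 threads to run this batch. When I run the batch, I can see that the slave step never calls the pagingItemReader() method, and as a result the OraclePagingQueryProvider is never invoked. I also noticed that the number of missed (unmoved) records equals the number of configured threads. I developed this POC following the guidance at: https://github.com/mminella/LearningSpringBatch/tree/master/src/localPartitioning
Note that the code below works fine when I replace the master and slave steps with plain read, process, and write logic that involves no multithreading.
The BATCH_STEP_EXECUTION table in the DB also shows that only 8 records were moved (again 2 records are missing, which equals the number of threads). The DB records are as follows: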
STEP_NAME              STATUS     COMMIT_COUNT  READ_COUNT  WRITE_COUNT  EXIT_CODE
slaveStep:partition1   COMPLETED  1             4           4            COMPLETED
slaveStep:partition0   COMPLETED  1             4           4            COMPLETED
masterStep             COMPLETED  2             8           8            COMPLETED
Code snippet of the configuration class:
    @Bean
    public JobRegistryBeanPostProcessor jobRegistrar() throws Exception {
        JobRegistryBeanPostProcessor registrar = new JobRegistryBeanPostProcessor();
        registrar.setJobRegistry(this.jobRegistry);
        registrar.setBeanFactory(this.applicationContext.getAutowireCapableBeanFactory());
        registrar.afterPropertiesSet();
        return registrar;
    }

    @Bean
    public JobOperator jobOperator() throws Exception {
        SimpleJobOperator simpleJobOperator = new SimpleJobOperator();
        simpleJobOperator.setJobLauncher(this.jobLauncher);
        simpleJobOperator.setJobParametersConverter(new DefaultJobParametersConverter());
        simpleJobOperator.setJobRepository(this.jobRepository);
        simpleJobOperator.setJobExplorer(this.jobExplorer);
        simpleJobOperator.setJobRegistry(this.jobRegistry);
        simpleJobOperator.afterPropertiesSet();
        return simpleJobOperator;
    }

    @Bean
    public ColumnRangePartitioner partitioner() {
        ColumnRangePartitioner partitioner = new ColumnRangePartitioner();
        partitioner.setColumn("id");
        partitioner.setDataSource(this.dataSource);
        partitioner.setTable("Employee");
        LOGGER.info("partitioner---->" + partitioner);
        return partitioner;
    }

    @Bean
    public Step masterStep() {
        return stepBuilderFactory.get("masterStep")
                .partitioner(slaveStep().getName(), partitioner())
                .step(slaveStep())
                .gridSize(gridSize)
                .taskExecutor(taskExecutorConfiguration.taskExecutor())
                .build();
    }

    @Bean
    public Step slaveStep() {
        return stepBuilderFactory.get("slaveStep")
                .<Employee, NewEmployee>chunk(chunkSize)
                .reader(pagingItemReader(null, null))
                .processor(employeeProcessor())
                .writer(employeeWriter.customItemWriter())
                .build();
    }

    @Bean
    public Job job() {
        return jobBuilderFactory.get("FR")
                .start(masterStep())
                .build();
    }

    @Bean
    public ItemProcessor<Employee, NewEmployee> employeeProcessor() {
        return new EmployeeProcessor();
    }

    @Override
    public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        this.applicationContext = applicationContext;
    }
    @Bean
    @StepScope
    public JdbcPagingItemReader<Employee> pagingItemReader(
            @Value("#{stepExecutionContext['minValue']}") Long minvalue,
            @Value("#{stepExecutionContext['maxValue']}") Long maxvalue) {
        JdbcPagingItemReader<Employee> reader = new JdbcPagingItemReader<Employee>();
        reader.setDataSource(this.dataSource);
        // This should equal the chunk size for performance reasons.
        reader.setFetchSize(chunkSize);
        reader.setRowMapper((resultSet, i) -> {
            return new Employee(resultSet.getLong("id"),
                    resultSet.getString("firstName"),
                    resultSet.getString("lastName"));
        });
        OraclePagingQueryProvider provider = new OraclePagingQueryProvider();
        provider.setSelectClause("id, firstName, lastName");
        provider.setFromClause("from Employee");
        LOGGER.info("min-->" + minvalue);
        LOGGER.info("max-->" + maxvalue);
        // Both bounds are inclusive, matching the [minValue, maxValue] ranges
        // that the partitioner puts into each step execution context.
        provider.setWhereClause("where id >= " + minvalue + " and id <= " + maxvalue);
        Map<String, Order> sortKeys = new HashMap<>(1);
        sortKeys.put("id", Order.ASCENDING);
        provider.setSortKeys(sortKeys);
        reader.setQueryProvider(provider);
        LOGGER.info("reader--->" + reader);
        return reader;
    }
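The partitioner in this question stores inclusive bounds under minValue and maxValue, so the reader's predicate presumably has to select minValue <= id <= maxValue. With ranges such as 1..5 and 6..10, an exclusive upper bound would read 4 rows per partition instead of 5, which happens to be the pattern in the counts reported above. A minimal sketch of an inclusive predicate builder follows; the helper class RangeWhereClause is hypothetical and not part of Spring Batch:

```java
// Hypothetical helper (not a Spring Batch class): builds an inclusive
// range predicate for a numeric column from the partition bounds.
final class RangeWhereClause {

    private RangeWhereClause() {
    }

    // Returns e.g. "where id >= 1 and id <= 5" for column "id", bounds 1 and 5.
    static String forRange(String column, long minValue, long maxValue) {
        if (minValue > maxValue) {
            throw new IllegalArgumentException("minValue must not exceed maxValue");
        }
        return "where " + column + " >= " + minValue
                + " and " + column + " <= " + maxValue;
    }

    public static void main(String[] args) {
        System.out.println(forRange("id", 1, 5));   // where id >= 1 and id <= 5
        System.out.println(forRange("id", 6, 10));  // where id >= 6 and id <= 10
    }
}
```

The result could be passed to provider.setWhereClause(...). Because the bounds here are numbers taken from the step execution context, plain concatenation is tolerable; values coming from user input should be bound as parameters instead.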
Code snippet of the ColumnRangePartitioner class:
    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        int min = jdbcTemplate.queryForObject("SELECT MIN(" + column + ") from " + table, Integer.class);
        int max = jdbcTemplate.queryForObject("SELECT MAX(" + column + ") from " + table, Integer.class);
        // Width of each id range; the last range is clamped to max below.
        int targetSize = (max - min) / gridSize + 1;
        Map<String, ExecutionContext> result = new HashMap<String, ExecutionContext>();
        int number = 0;
        int start = min;
        int end = start + targetSize - 1;
        while (start <= max) {
            ExecutionContext value = new ExecutionContext();
            result.put("partition" + number, value);
            if (end >= max) {
                end = max;
            }
            LOGGER.info("Start-->" + start);
            LOGGER.info("end-->" + end);
            // Both bounds are inclusive.
            value.putInt("minValue", start);
            value.putInt("maxValue", end);
            start += targetSize;
            end += targetSize;
            number++;
        }
        return result;
    }
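To make the range arithmetic concrete, here is a small standalone sketch that repeats the same computation without a database. It assumes the ids are the contiguous values 1 to 10 and a gridSize of 2, neither of which is stated in the question; the class name RangeSplitDemo is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration of the partitioner's range arithmetic.
// Assumption: min and max are passed in directly instead of being
// queried from the table.
final class RangeSplitDemo {

    private RangeSplitDemo() {
    }

    // Returns inclusive {start, end} pairs covering min..max.
    static List<int[]> split(int min, int max, int gridSize) {
        int targetSize = (max - min) / gridSize + 1;
        List<int[]> ranges = new ArrayList<>();
        int start = min;
        int end = start + targetSize - 1;
        while (start <= max) {
            if (end >= max) {
                end = max; // clamp the last range
            }
            ranges.add(new int[] {start, end});
            start += targetSize;
            end += targetSize;
        }
        return ranges;
    }

    public static void main(String[] args) {
        List<int[]> ranges = split(1, 10, 2);
        for (int i = 0; i < ranges.size(); i++) {
            int[] r = ranges.get(i);
            System.out.println("partition" + i + ": " + r[0] + ".." + r[1]);
        }
        // prints:
        // partition0: 1..5
        // partition1: 6..10
    }
}
```

Under these assumptions targetSize is (10 - 1) / 2 + 1 = 5, so each partition should cover 5 ids, and a reader that honours both inclusive bounds would read 5 rows per partition rather than 4.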
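Answer 0 (score: 0)
I found the solution to this problem. We have to add a partitionHandler to the masterStep, after the partitioner. The partitionHandler is where we define the slaveStep and the other configuration. The code snippets follow.
MasterStep: add the partitionHandler call here,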
    stepBuilderFactory
            .get("userMasterStep")
            .partitioner(userSlaveStep().getName(), userPartitioner())
            .partitionHandler(userMasterSlaveHandler())
            .build();
Define another bean for the partitionHandler and reference the slave step there:
    @Bean
    public PartitionHandler userMasterSlaveHandler() throws Exception {
        TaskExecutorPartitionHandler handler = new TaskExecutorPartitionHandler();
        handler.setGridSize(gridSize);
        handler.setTaskExecutor(taskExecutorConfiguration.taskExecutor());
        handler.setStep(userSlaveStep());
        handler.afterPropertiesSet();
        return handler;
    }
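Conceptually, a task-executor-based partition handler hands each partition to a worker thread and waits for all of them to finish. The plain-Java sketch below simulates only that fan-out and aggregation with an ExecutorService. It is not Spring Batch's implementation; the class name PartitionFanOutDemo is hypothetical, and each worker simply counts the ids in its inclusive range instead of reading from a database:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Simulation only: one task per partition, results summed by the caller.
final class PartitionFanOutDemo {

    private PartitionFanOutDemo() {
    }

    // partitions maps a partition name to an inclusive {min, max} id range.
    static int runAll(Map<String, int[]> partitions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int[] range : partitions.values()) {
                // Stand-in for a slave step: "reads" every id in the range.
                Callable<Integer> worker = () -> range[1] - range[0] + 1;
                futures.add(pool.submit(worker));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // wait for each partition to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for partitions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A partition worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, int[]> partitions = new LinkedHashMap<>();
        partitions.put("partition0", new int[] {1, 5});
        partitions.put("partition1", new int[] {6, 10});
        System.out.println("total items: " + runAll(partitions, 2)); // total items: 10
    }
}
```

The total should equal the number of rows in the source table (10 in the question, assuming contiguous ids), which is a convenient sanity check against the READ_COUNT of the master step.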