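For a Spring Batch job, we have two different queries against the same table. The requirement is a reader that executes both queries to read data from that table.
One approach might be: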
<batch:step id="firstStep" next="secondStep">
<batch:tasklet>
<batch:chunk reader="firstReader" writer="firstWriter" commit-interval="2">
</batch:chunk>
</batch:tasklet>
</batch:step>
<batch:step id="secondStep" next="thirdStep">
<batch:tasklet>
<batch:chunk reader="secondReader" writer="secondWriter"
commit-interval="2">
</batch:chunk>
</batch:tasklet>
</batch:step>
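But this requires a whole second step definition that is a copy of the first. Is there another way to achieve the same thing? I am looking for something like MultiResourceItemReader for database-based readers, one that aggregates their data.
Answer 0 (score: 3)
You can create a view in the database that covers the different queries and read from it with a JdbcPagingItemReader. If that is not an option, there are several other approaches; one that has worked for me is shown below. Spring offers other options as well, but from a developer's point of view the following is certainly a workable one.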
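Create two item readers. The first one is below:
<!-- Use org.springframework.batch.item.database.JdbcCursorItemReader for simple queries: it accepts the "sql" property directly, whereas JdbcPagingItemReader is configured through a queryProvider -->
<bean id="itemReader1"
class="org.springframework.batch.item.database.JdbcPagingItemReader">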
<property name="sql"
value=" FROM table1" />
.......
<property name="rowMapper">
<bean class="com.sjena.AccountApplicationMapper" />
</property>
</bean>
Then another reader for table2:
<bean id="itemReader2"
class="org.springframework.batch.item.database.JdbcCursorItemReader">
<property name="sql"
value="FROM table2" />
.......
<property name="rowMapper">
<bean class="com.sjena.AccountApplicationMapper" />
</property>
</bean>
Then have your custom reader delegate to both of them:
<bean id="customItemReader" class="com.sjena.spring.reader.MyCustomReader"
scope="step">
<property name="itemReader1" ref="itemReader1" />
<property name="itemReader2" ref="itemReader2" />
<property name="pageSize" value="5" />
</bean>
And finally use this custom reader in the job:
<job id="testJob" xmlns="http://www.springframework.org/schema/batch">
<step id="step1">
<tasklet>
<chunk reader="customItemReader" writer="itemWriter"
commit-interval="1" />
</tasklet>
</step>
</job>
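Your class then looks like this:
import org.springframework.batch.item.ItemReader;

public class MyCustomReader implements ItemReader<AccountApplicationSummary> {
    private int pageSize; // you may use a different page size for each item reader
    private ItemReader<AccountApplication> itemReader1;
    private ItemReader<AccountApplication> itemReader2;

    // Setters are needed because the XML configuration injects these values as properties.
    public void setPageSize(int pageSize) { this.pageSize = pageSize; }
    public void setItemReader1(ItemReader<AccountApplication> itemReader1) { this.itemReader1 = itemReader1; }
    public void setItemReader2(ItemReader<AccountApplication> itemReader2) { this.itemReader2 = itemReader2; }

    @Override
    public AccountApplicationSummary read() throws Exception {
        // To apply pageSize, call setPageSize(pageSize) on a delegate that is a JdbcPagingItemReader.
        // Do this kind of initialization once, e.g. implement InitializingBean and set it in afterPropertiesSet().
        // Like pageSize, you can set any other property you may need.
        // Note: the delegates are ItemStreams, so they also have to be opened, e.g. by registering them as streams on the step.
        AccountApplication application1 = itemReader1.read();
        AccountApplication application2 = itemReader2.read();
        if (application1 == null && application2 == null) {
            return null; // both readers are exhausted: signal end of input
        }
        // You now have results from both tables and can combine them as needed.
        AccountApplicationSummary summary = new AccountApplicationSummary();
        return summary;
    }
}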
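The reader above pairs one row from each delegate on every call. If the goal is closer to what the question describes, something like MultiResourceItemReader that drains one query and then moves on to the next, the delegation can be sequential instead. The following is only a minimal sketch under stated assumptions: `SimpleReader` is a hypothetical, simplified stand-in for Spring Batch's `ItemReader` (no checked exceptions) so the example runs without the framework, and `SequentialReader` and `fromList` are illustrative names, not Spring Batch classes.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SequentialReaderDemo {

    /** Hypothetical stand-in for Spring Batch's ItemReader: null signals end of data. */
    interface SimpleReader<T> {
        T read();
    }

    /** Drains each delegate in order before moving on to the next one. */
    static class SequentialReader<T> implements SimpleReader<T> {
        private final Iterator<SimpleReader<T>> delegates;
        private SimpleReader<T> current;

        SequentialReader(List<SimpleReader<T>> delegates) {
            this.delegates = delegates.iterator();
        }

        @Override
        public T read() {
            while (true) {
                if (current == null) {
                    if (!delegates.hasNext()) {
                        return null; // every delegate is exhausted
                    }
                    current = delegates.next();
                }
                T item = current.read();
                if (item != null) {
                    return item;
                }
                current = null; // this delegate is done; try the next one
            }
        }
    }

    /** Illustrative helper: wraps a fixed list as a reader, standing in for a query result. */
    static <T> SimpleReader<T> fromList(List<T> rows) {
        Iterator<T> it = rows.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        SimpleReader<String> reader = new SequentialReader<>(Arrays.asList(
                fromList(Arrays.asList("a1", "a2")),
                fromList(Arrays.asList("b1"))));
        for (String row = reader.read(); row != null; row = reader.read()) {
            System.out.println(row);
        }
    }
}
```

Running `main` prints `a1`, `a2`, `b1` on separate lines. In a real job the delegates would be the two JDBC readers, and the wrapper would also need to open and close them (for example by implementing ItemStream and forwarding the calls), which this sketch leaves out.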
Answer 1 (score: 0)
This answer is an adaptation of Hansjoerg's answer to a similar question about executing a step multiple times: Spring Batch - Looping a reader/processor/writer step
package hello;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import javax.sql.DataSource;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.job.builder.FlowBuilder;
import org.springframework.batch.core.job.flow.Flow;
import org.springframework.batch.core.job.flow.support.SimpleFlow;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.beans.factory.annotation.Autowired;
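import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.SimpleAsyncTaskExecutor;
import org.springframework.jdbc.core.BeanPropertyRowMapper;
import org.springframework.jdbc.datasource.SimpleDriverDataSource;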
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
List<String> queries = Arrays.asList("some query1", "some query2");
@Bean
public Job multiQueryJob() {
List<Step> steps = queries.stream().map(query -> createStep(query)).collect(Collectors.toList());
return jobBuilderFactory.get("multiQueryJob")
.start(createParallelFlow(steps))
.end()
.build();
}
private Step createStep(String query) {
return stepBuilderFactory.get("convertStepFor" + query)
.<Actor, Actor>chunk(10) // explicit item types so reader and writer are type-checked
.reader(createQueryReader(query))
.writer(dummyWriter())
.build();
}
private Flow createParallelFlow(List<Step> steps) {
SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
taskExecutor.setConcurrencyLimit(1); // force sequential execution
List<Flow> flows = steps.stream()
.map(step -> new FlowBuilder<Flow>("flow_" + step.getName())
.start(step)
.build())
.collect(Collectors.toList());
return new FlowBuilder<SimpleFlow>("parallelStepsFlow")
.split(taskExecutor)
.add(flows.toArray(new Flow[flows.size()])).build();
}
public JdbcCursorItemReader<Actor> createQueryReader(String query) {
JdbcCursorItemReader<Actor> reader = new JdbcCursorItemReader<>();
reader.setDataSource(dataSource());
reader.setSql(query);
reader.setRowMapper(mapper());
return reader;
}
public BeanPropertyRowMapper<Actor> mapper(){
BeanPropertyRowMapper<Actor> mapper = new BeanPropertyRowMapper<>();
mapper.setMappedClass(Actor.class);
return mapper;
}
public DummyItemWriter dummyWriter() {
return new DummyItemWriter();
}
public DataSource dataSource() {
final SimpleDriverDataSource dataSource = new SimpleDriverDataSource();
try {
dataSource.setDriver(new com.mysql.jdbc.Driver());
} catch (SQLException e) {
e.printStackTrace();
}
dataSource.setUrl("jdbc:mysql://localhost:3306/SAKILA");
dataSource.setUsername("sa");
dataSource.setPassword("password");
return dataSource;
}
}
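I put two dummy queries in the query list; you have to supply the real ones. The job is built according to the number of queries, and in this example I use Spring Batch's JdbcCursorItemReader to read the data from the database.
You can recreate this configuration from the sample Spring provides at https://spring.io/guides/gs/batch-processing/ by adding an Actor POJO and, last but not least, deleting the classes you don't need (you only need the BatchConfiguration and Application classes).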
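One detail worth watching in the configuration above: createStep builds the step name from the raw SQL text, so real queries produce long step names. As far as I know, the default Spring Batch metadata schema stores the step name in a VARCHAR(100) column, so it may be safer to derive a short name from the query's position in the list. A minimal sketch of that idea, where `StepNames`, `stepNamesFor`, and the `queryStep` prefix are hypothetical names chosen for illustration:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StepNames {

    /** Maps a short, unique step name ("queryStep0", "queryStep1", ...) to each query, in list order. */
    static Map<String, String> stepNamesFor(List<String> queries) {
        Map<String, String> named = new LinkedHashMap<>();
        for (int i = 0; i < queries.size(); i++) {
            named.put("queryStep" + i, queries.get(i));
        }
        return named;
    }

    public static void main(String[] args) {
        Map<String, String> named = stepNamesFor(Arrays.asList("some query1", "some query2"));
        // Each entry could feed a step builder: the key as the step name, the value as the reader's SQL.
        named.forEach((name, sql) -> System.out.println(name + " -> " + sql));
    }
}
```

With names like these, createStep could take the name and the query as two separate arguments instead of concatenating the SQL into the name.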