Question

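For a Spring Batch job, we have two different queries on the same table. The requirement is to have a reader that executes both queries to read data from that table.

One approach could be: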

<batch:step id="firstStep" next="secondStep">
    <batch:tasklet>
        <batch:chunk reader="firstReader" writer="firstWriter" commit-interval="2">
        </batch:chunk>
    </batch:tasklet>
</batch:step>
<batch:step id="secondStep" next="thirdStep">
    <batch:tasklet>
        <batch:chunk reader="secondReader" writer="secondWriter"
            commit-interval="2">
        </batch:chunk>
    </batch:tasklet>
</batch:step>

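But this requires defining an entire second step that is just a copy of the first. Is there another way to achieve the same thing? I am looking for something like MultiResourceItemReader for database-based readers, one that aggregates the data together.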

Answer 1

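You can create a view in the database that combines the different queries and read from it with a JdbcPagingItemReader. If that is not an option, there are other ways; one that has worked for me is shown below. Spring offers other options as well, but from a developer's point of view the following is certainly a workable choice.

Create two item readers. The first one is shown below: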

<!-- use org.springframework.batch.item.database.JdbcCursorItemReader for simple queries -->
<!-- note: the sql property belongs to JdbcCursorItemReader; a JdbcPagingItemReader is configured through a queryProvider instead -->
<bean id="itemReader1"
    class="org.springframework.batch.item.database.JdbcPagingItemReader">
    <property name="sql"
        value=" FROM   table1" />
    .......
    <property name="rowMapper">
        <bean class="com.sjena.AccountApplicationMapper" />
    </property>
</bean>

Then another reader for table2:

<bean id="itemReader2"
    class="org.springframework.batch.item.database.JdbcCursorItemReader">
    <property name="sql"
        value="FROM   table2" />
    .......
    <property name="rowMapper">
        <bean class="com.sjena.AccountApplicationMapper" />
    </property>
</bean>

Then delegate to both of them from your custom reader:

<bean id="customItemReader" class="com.sjena.spring.reader.MyCustomReader"
    scope="step">
    <property name="itemReader1" ref="itemReader1" />
    <property name="itemReader2" ref="itemReader2" />
    <property name="pageSize" value="5" />

</bean>

Finally, use this custom reader in the job:

<job id="testJob" xmlns="http://www.springframework.org/schema/batch">
    <step id="step1">
        <tasklet>
            <chunk reader="customItemReader" writer="itemWriter"
                commit-interval="1" />
        </tasklet>
    </step>
</job>

Your class then looks like this:

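import org.springframework.batch.item.ItemReader;

public class MyCustomReader implements ItemReader<AccountApplicationSummary> {

    private int pageSize; // you may use a different page size for each item reader
    private ItemReader<AccountApplication> itemReader1;
    private ItemReader<AccountApplication> itemReader2;

    @Override
    public AccountApplicationSummary read() throws Exception {
        // If itemReader1 is a JdbcPagingItemReader, call itemReader1.setPageSize(pageSize)
        // once during initialization (for example, implement InitializingBean and do it
        // in afterPropertiesSet()) rather than on every read().
        // Like pageSize, you can set any other property you may need.
        // The database readers are ItemStreams, so they must be opened before read()
        // is called, for example by registering them as streams on the step.

        AccountApplication application1 = itemReader1.read();
        AccountApplication application2 = itemReader2.read();

        // A reader must return null to signal the end of input; otherwise the step never ends.
        if (application1 == null && application2 == null) {
            return null;
        }

        // You now have results from both tables and can combine them as needed.
        AccountApplicationSummary summary = new AccountApplicationSummary();

        return summary;
    }

    // Setters used by the <property> elements of the customItemReader bean.
    public void setPageSize(int pageSize) { this.pageSize = pageSize; }
    public void setItemReader1(ItemReader<AccountApplication> itemReader1) { this.itemReader1 = itemReader1; }
    public void setItemReader2(ItemReader<AccountApplication> itemReader2) { this.itemReader2 = itemReader2; }
}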
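
The custom reader above reads one item from each delegate per call. If what you want is closer to MultiResourceItemReader, where the first query is drained completely before the second one starts, the delegation can be arranged sequentially instead. The following is only a plain-Java sketch of that idea, not Spring Batch code: SimpleReader is a hypothetical stand-in for the ItemReader contract (read() returns null when exhausted, checked exceptions omitted for brevity), and ListReader is an in-memory substitute for the JDBC readers so the example runs without a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the ItemReader contract: read() returns null when exhausted.
interface SimpleReader<T> {
    T read();
}

// Reads every item from the first delegate, then moves on to the next one.
class ChainedReader<T> implements SimpleReader<T> {
    private final List<SimpleReader<T>> delegates;
    private int current = 0;

    ChainedReader(List<SimpleReader<T>> delegates) {
        this.delegates = new ArrayList<>(delegates);
    }

    @Override
    public T read() {
        while (current < delegates.size()) {
            T item = delegates.get(current).read();
            if (item != null) {
                return item;
            }
            current++; // this delegate is exhausted, try the next one
        }
        return null; // all delegates exhausted: signals end of input
    }
}

// In-memory reader used only to make the sketch runnable without a database.
class ListReader<T> implements SimpleReader<T> {
    private final Iterator<T> iterator;

    ListReader(List<T> items) {
        this.iterator = items.iterator();
    }

    @Override
    public T read() {
        return iterator.hasNext() ? iterator.next() : null;
    }
}

class ChainedReaderDemo {
    // Drains a reader into a list, stopping at the first null.
    static List<String> readAll(SimpleReader<String> reader) {
        List<String> result = new ArrayList<>();
        for (String item = reader.read(); item != null; item = reader.read()) {
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        SimpleReader<String> firstQuery = new ListReader<String>(Arrays.asList("a1", "a2"));
        SimpleReader<String> secondQuery = new ListReader<String>(Arrays.asList("b1"));
        SimpleReader<String> chained =
                new ChainedReader<String>(Arrays.asList(firstQuery, secondQuery));
        System.out.println(readAll(chained)); // prints [a1, a2, b1]
    }
}
```

In a real job the same loop would sit inside a class implementing ItemReader, and the JDBC delegates would still need to be opened and closed as streams, which this sketch leaves out.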

Answer 2

This answer is adapted from Hansjoerg's answer to a similar question about executing a step multiple times: Spring Batch - Looping a reader/processor/writer step

package hello;

import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.job.builder.FlowBuilder;
import org.springframework.batch.core.job.flow.Flow;
import org.springframework.batch.core.job.flow.support.SimpleFlow;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.SimpleAsyncTaskExecutor;
import org.springframework.jdbc.core.BeanPropertyRowMapper;
import org.springframework.jdbc.datasource.SimpleDriverDataSource;

@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    List<String> queries = Arrays.asList("some query1", "some query2");

    @Bean
    public Job multiQueryJob() {

        List<Step> steps = queries.stream().map(query -> createStep(query)).collect(Collectors.toList());       

        return jobBuilderFactory.get("multiQueryJob")
                .start(createParallelFlow(steps))
                .end()
                .build();
    }

    private Step createStep(String query) {

        return stepBuilderFactory.get("convertStepFor" + query)
                .<Actor, Actor>chunk(10)
                .reader(createQueryReader(query))
                .writer(dummyWriter())
                .build();
    }

    private Flow createParallelFlow(List<Step> steps) {
        SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
        taskExecutor.setConcurrencyLimit(1); // force sequential execution

        List<Flow> flows = steps.stream()

                .map(step -> new FlowBuilder<Flow>("flow_" + step.getName())
                        .start(step) 
                        .build()) 
                .collect(Collectors.toList());

        return new FlowBuilder<SimpleFlow>("parallelStepsFlow")
                .split(taskExecutor)
                .add(flows.toArray(new Flow[flows.size()])).build();
    }

    public JdbcCursorItemReader<Actor> createQueryReader(String query) {
        JdbcCursorItemReader<Actor> reader = new JdbcCursorItemReader<>();
        reader.setDataSource(dataSource());
        reader.setSql(query);
        reader.setRowMapper(mapper());
        return reader;
    }

    public BeanPropertyRowMapper<Actor> mapper(){
        BeanPropertyRowMapper<Actor> mapper = new BeanPropertyRowMapper<>();
        mapper.setMappedClass(Actor.class);
        return mapper;
    }

    public DummyItemWriter dummyWriter() {
        return new DummyItemWriter();
    }

    public DataSource dataSource() {
        final SimpleDriverDataSource dataSource = new SimpleDriverDataSource();
        try {
            dataSource.setDriver(new com.mysql.jdbc.Driver());
        } catch (SQLException e) {
            e.printStackTrace();
        }
        dataSource.setUrl("jdbc:mysql://localhost:3306/SAKILA");
        dataSource.setUsername("sa");
        dataSource.setPassword("password");
        return dataSource;
    }

}

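I have put two dummy queries in the query list; you have to supply the real ones. The job is built from however many queries are in the list, and in this example I use Spring Batch's JdbcCursorItemReader to read the data from the database.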
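
One detail worth considering: the configuration above names each step "convertStepFor" + query, so the full SQL text ends up in the step name. Step names are persisted in the job repository, so with real queries it may be more convenient to derive a short name from the query's position in the list. A minimal sketch of that naming scheme follows; StepNames, forQueries, and the "queryStep" prefix are hypothetical names chosen for illustration and are not part of Spring Batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StepNames {
    // Builds one short, unique name per query based on its position in the list.
    static List<String> forQueries(String prefix, List<String> queries) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < queries.size(); i++) {
            names.add(prefix + (i + 1));
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> queries = Arrays.asList("some query1", "some query2");
        System.out.println(forQueries("queryStep", queries)); // prints [queryStep1, queryStep2]
    }
}
```

With such names, createStep would take the generated name alongside the query instead of concatenating the query into the name.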

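You can recreate this configuration by taking the example Spring provides at https://spring.io/guides/gs/batch-processing/ and adding the Actor POJO. Last but not least, remove the classes you do not need (you only need the BatchConfiguration and Application classes).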

How do I implement a reader that runs multiple queries but produces the same output item?
