Spring Batch: partitioned reader is called again

Time: 2016-06-13 14:47:40

Tags: java spring multithreading spring-batch

I have to migrate some data from one database to another, and I am using Spring Batch with partitioning. The job is configured as follows:

...
...
<bean id="migrationProcessor" class="it.migrazione.MigrazioneProcessor" scope="step"/>
<bean id="migrationWriter" class="it.migrazione.MigrazioneWriter" scope="step"/>
<bean id="migrationReader" class="it.migrazione.MigrazioneReader" scope="step"/>

<bean id="partitioner" class="it.migrazione.MigrazionePartitioner" />

<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>

<bean id="threadPoolTaskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="10" />
    <property name="maxPoolSize" value="10" />
    <property name="allowCoreThreadTimeOut" value="true" />
</bean>

<job id="migrationJob" xmlns="http://www.springframework.org/schema/batch">
    <step id="masterStep">
    <partition step="slave" partitioner="partitioner">
        <handler grid-size="10" task-executor="threadPoolTaskExecutor" />
    </partition>
    </step>
</job>

<step id="slave" xmlns="http://www.springframework.org/schema/batch">
    <tasklet throttle-limit="1" transaction-manager="transactionManager">
        <chunk reader="migrationReader" 
               processor="migrationProcessor"
               writer="migrationWriter" 
               commit-interval="1"/>
    </tasklet>
</step>

<bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
    <property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
...
...

The job knows how many rows have to be migrated, so the partitioner creates 10 execution contexts, each tied to a specific row range:

// Build one ExecutionContext per partition, each covering a contiguous row range.
for (int threadCount = 1; threadCount <= gridSize; threadCount++) {
    if (threadCount == 1)
        fromRow = 0;
    else
        fromRow = toRow + 1;   // start right after the previous partition's last row
    toRow += delta;

    context = new ExecutionContext();
    context.putInt("fromRow", fromRow);
    context.putInt("toRow", toRow);
    context.putString("name", "Processing Thread" + threadCount);
    result.put("partition" + threadCount, context);

    logger.info("Partition number >> " + threadCount + " from Row#: "
            + fromRow + " to Row#: " + toRow);
}
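
To make the range arithmetic easier to check in isolation, here is a minimal stand-alone sketch of the same idea in plain Java. It does not use the Spring Batch API: the class `RangePartitionSketch`, the method `split`, and the `int[]` pair standing in for an `ExecutionContext` are all hypothetical names chosen for illustration. It also assumes rows are numbered from 0 and that any remainder is spread over the first partitions, which may differ from how `delta` is computed in the real partitioner.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; it does not use the Spring Batch API.
class RangePartitionSketch {

    // Splits rows 0..totalRows-1 into gridSize contiguous, non-overlapping ranges.
    // Each value is {fromRow, toRow}, both inclusive. A partition that receives
    // no rows ends up with toRow == fromRow - 1 (an empty range).
    static Map<String, int[]> split(int totalRows, int gridSize) {
        Map<String, int[]> result = new LinkedHashMap<>();
        int base = totalRows / gridSize;       // minimum rows per partition
        int remainder = totalRows % gridSize;  // the first `remainder` partitions get one extra row
        int fromRow = 0;
        for (int i = 1; i <= gridSize; i++) {
            int size = base + (i <= remainder ? 1 : 0);
            int toRow = fromRow + size - 1;
            result.put("partition" + i, new int[] {fromRow, toRow});
            fromRow = toRow + 1;               // next partition starts right after this one
        }
        return result;
    }

    public static void main(String[] args) {
        split(95, 10).forEach((name, r) ->
                System.out.println(name + " from Row#: " + r[0] + " to Row#: " + r[1]));
    }
}
```

Computing each range from a running `fromRow` keeps the partitions disjoint, so no row is assigned to two threads.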

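When I run the job, some threads read a second time. For example, thread #1 calls the reader, processor and writer again. I don't understand why. Is it possible to make a thread execute its chunk only once, without having to check whether the read has already been called? When the writer for a given partition finishes, why does the thread call the reader again? It is as if the reader does not immediately see the changes made by the writer.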
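
For context on the behaviour described above: in a chunk-oriented step the reader is called repeatedly until it returns null, so with a commit interval of 1 each thread goes through read, process and write once per item and then calls the reader again. The sketch below models that loop in plain Java. It does not use Spring Batch classes, and `ChunkLoopSketch`, `RangeReader` and `run` are hypothetical names. Whether this accounts for the repeated calls here is only an assumption, since the code of `MigrazioneReader` is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of a chunk-oriented step; it does not use Spring Batch classes.
class ChunkLoopSketch {

    // Reader bound to one partition's inclusive range, like fromRow/toRow in the step context.
    static class RangeReader {
        private int next;
        private final int toRow;
        int calls = 0;  // counts how many times read() was invoked

        RangeReader(int fromRow, int toRow) {
            this.next = fromRow;
            this.toRow = toRow;
        }

        // Returns the next row number, or null once the range is exhausted.
        Integer read() {
            calls++;
            return next <= toRow ? next++ : null;
        }
    }

    // Runs read -> process -> write one item at a time until read() returns null.
    static List<String> run(RangeReader reader) {
        List<String> written = new ArrayList<>();
        Integer item;
        while ((item = reader.read()) != null) {
            String processed = "row-" + item;  // stands in for the processor
            written.add(processed);            // stands in for the writer
        }
        return written;
    }

    public static void main(String[] args) {
        RangeReader reader = new RangeReader(0, 2);
        System.out.println(run(reader) + " after " + reader.calls + " read() calls");
    }
}
```

For the range 0 to 2 the model calls `read()` four times: three calls return rows and the fourth returns null, which is what ends the loop. A reader that never returns null for its partition would keep being called.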

0 answers:

No answers yet