Question

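I have a table with more than 1 million customers. Each customer's information is updated regularly, but at most once per day. I have a Spring Batch job that:

reads from the customer table (JdbcCursorItemReader)
processes the customer information (ItemProcessor)
writes to the customer table (ItemWriter)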

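I want to run 10 jobs at once, all reading from the same Customer table, without any customer being read twice. Is this possible with Spring Batch, or is it something I have to handle at the database level using a crawlLog table, as described in this post?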

How do I lock read/write to MySQL tables so that I can select and then insert without other programs reading/writing to the database?

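I know that parameters can be passed to a job. I could read all the customer IDs and distribute them evenly across the 10 jobs. But is that the right way to do it?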

Answer 1

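The framework offers several ways to do what you want, and the right one depends on your situation. The simplest is to add a task executor to the step or flow: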

<step id="copy">
  <tasklet task-executor="taskExecutor" throttle-limit="10">
  ...
  </tasklet>
</step>

<beans:bean id="taskExecutor"
  class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
  <beans:property name="corePoolSize" value="10"/>
  <beans:property name="maxPoolSize" value="15"/>
</beans:bean>
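
With a task executor on the step, all worker threads pull items from the same reader, so a customer is handed to only one thread as long as reads are serialized. A cursor-based reader such as JdbcCursorItemReader is not thread-safe on its own, so in a multi-threaded step it generally has to be synchronized (Spring Batch provides SynchronizedItemStreamReader for this) or swapped for a paging reader. The plain-Java sketch below only illustrates the idea and uses no Spring classes. SharedReader, SharedReaderDemo and drain are hypothetical names, and the in-memory ID list stands in for the database cursor.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a shared item reader; not a Spring Batch class.
final class SharedReader {
    private final Iterator<Long> ids;

    SharedReader(List<Long> customerIds) {
        this.ids = customerIds.iterator();
    }

    // synchronized: only one thread advances the iterator at a time,
    // so each id is handed out exactly once. Returns null when exhausted.
    synchronized Long read() {
        return ids.hasNext() ? ids.next() : null;
    }
}

final class SharedReaderDemo {
    // Drains ids 1..customers with `threads` workers.
    // Returns {total reads, distinct ids seen}.
    static int[] drain(int customers, int threads) {
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= customers; id++) {
            ids.add(id);
        }
        SharedReader reader = new SharedReader(ids);
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        AtomicInteger reads = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                Long id;
                while ((id = reader.read()) != null) {
                    reads.incrementAndGet(); // count every item handed out
                    seen.add(id);            // track which ids were handed out
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new int[] {reads.get(), seen.size()};
    }
}
```

Calling SharedReaderDemo.drain(1000, 10) returns {1000, 1000}: the total number of reads equals the number of distinct IDs, so no customer was read twice.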

You may want to read about this and other techniques in the scalability section of the official Spring Batch documentation.
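
One of those other techniques is partitioning, which is close to the idea in the question of dividing customer IDs evenly among 10 workers. The sketch below is plain Java and shows only the range arithmetic. In Spring Batch this logic would typically sit inside a Partitioner implementation, whose partition(int gridSize) method returns one ExecutionContext per range, with the reader's query restricted to that range. IdRangeSplitter and split are hypothetical names, and the sketch assumes numeric IDs that are reasonably dense between the minimum and maximum.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Spring Batch.
final class IdRangeSplitter {
    // Splits the inclusive range [minId, maxId] into at most `parts`
    // contiguous, non-overlapping inclusive ranges, each as {from, to}.
    static List<long[]> split(long minId, long maxId, int parts) {
        if (parts <= 0 || maxId < minId) {
            throw new IllegalArgumentException("need parts > 0 and minId <= maxId");
        }
        long total = maxId - minId + 1;
        long base = total / parts;       // minimum size of every range
        long remainder = total % parts;  // the first `remainder` ranges get one extra id

        List<long[]> ranges = new ArrayList<>();
        long from = minId;
        for (int i = 0; i < parts; i++) {
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                break; // fewer ids than parts: stop once every id is assigned
            }
            long to = from + size - 1;
            ranges.add(new long[] {from, to});
            from = to + 1; // next range starts right after this one, so no overlap
        }
        return ranges;
    }
}
```

For example, IdRangeSplitter.split(1, 1_000_000, 10) yields 10 ranges, the first being 1 to 100000 and the last 900001 to 1000000. Because the ranges never overlap, workers that each query only their own range cannot read the same customer twice, though sparse IDs would make the workloads uneven.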

How to run concurrent jobs in Spring Batch without overlapping data reads
