Question

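I have a requirement where I need to look up some tables in the ItemProcessor. I don't want to make multiple JDBC calls for every row in the ItemProcessor, because that could cause performance problems once Spring Batch starts processing more records. Is there a workaround to avoid this? Is there a way to preload these objects before the ItemProcessor runs, or before the batch starts, and then reference them in the ItemProcessor?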

Answer 1

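You can read the data in a method annotated with @PostConstruct, which runs during Spring application context initialization. Have the ItemReader's read method return values from that list one at a time. Once the whole list has been returned, return null, which stops the reading.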

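import java.util.List;
import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring 6+
import org.springframework.batch.item.ItemReader;
import org.springframework.stereotype.Service;

@Service
public class YourItemReader implements ItemReader<DomainObject> {

    // Position of the next row to return
    private int index;

    // Rows loaded once at startup
    private List<DomainObject> dbRows;

    @PostConstruct
    public void init() {
        // Read from the database here and assign the result to dbRows
    }

    @Override
    public DomainObject read() {
        if (null != dbRows && index < dbRows.size()) {
            // Advance the index so each call returns the next row
            return dbRows.get(index++);
        }
        // Returning null tells Spring Batch that the input is exhausted
        return null;
    }
}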
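Since the question is about lookups inside the ItemProcessor, the same load-once idea could be applied there as well. The following is only a minimal sketch in plain Java so that it runs without Spring on the classpath. LookupProcessor, countryNamesByCode and the sample data are hypothetical and not from the original post. In a real job the class would implement ItemProcessor, init() would carry @PostConstruct, and its body would run a single JDBC query.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for an ItemProcessor that uses a preloaded lookup table.
// All names and data here are hypothetical.
class LookupProcessor {

    // Lookup table loaded once and reused for every item
    private final Map<String, String> countryNamesByCode = new HashMap<>();

    // Stands in for one JDBC query against the lookup table;
    // with Spring this method would be annotated with @PostConstruct
    void init() {
        countryNamesByCode.put("DE", "Germany");
        countryNamesByCode.put("FR", "France");
    }

    // Resolves the code from memory instead of issuing one query per item
    String process(String countryCode) {
        return countryNamesByCode.getOrDefault(countryCode, "UNKNOWN");
    }

    public static void main(String[] args) {
        LookupProcessor processor = new LookupProcessor();
        processor.init(); // Spring would call this once at startup
        System.out.println(processor.process("DE"));
        System.out.println(processor.process("XX"));
    }
}
```

This only makes sense when the lookup table fits comfortably in memory and does not change while the job is running.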

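If the number of records is in the millions, I would suggest reading from the database in chunks rather than reading all records at once, which can exhaust the heap and cause an out-of-memory error. This is easy to do by adding a column named STATUS to the table to track the state of each record. Initially, when you load the data into the table, set the status to 'NOT PROCESSED'. When the ItemReader reads a chunk of records, set their status to 'IN PROGRESS'. Once your ItemProcessor or ItemWriter has finished with them, change the status from 'IN PROGRESS' to 'PROCESSED'. Make sure the method that fetches the data from the database is synchronized. This ensures that multiple threads do not fetch the same data from the database.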

public List<DomainObject> read() {
    return fetchDataFromDb();
}

// synchronized so that two threads cannot fetch the same rows
private synchronized List<DomainObject> fetchDataFromDb() {
    List<DomainObject> list = new ArrayList<>();
    // Read one chunk-size batch of records whose status is 'NOT PROCESSED'
    // into list, and change the status of those records to 'IN PROGRESS'
    return list;
}
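To make the status transitions concrete, here is a rough sketch that replaces the database table with an in-memory list so it can run on its own. StatusChunkFetcher, Row, fetchChunk and markProcessed are hypothetical names chosen for illustration. In a real reader the loop inside fetchChunk would be a SELECT followed by an UPDATE in the same transaction.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for a table with a STATUS column (hypothetical names throughout).
class StatusChunkFetcher {

    enum Status { NOT_PROCESSED, IN_PROGRESS, PROCESSED }

    static class Row {
        final int id;
        Status status = Status.NOT_PROCESSED;
        Row(int id) { this.id = id; }
    }

    private final List<Row> table = new ArrayList<>();

    StatusChunkFetcher(int rowCount) {
        for (int i = 1; i <= rowCount; i++) {
            table.add(new Row(i));
        }
    }

    // synchronized: only one thread at a time can select and mark rows,
    // so two threads never claim the same row
    synchronized List<Row> fetchChunk(int chunkSize) {
        List<Row> chunk = new ArrayList<>();
        for (Row row : table) {
            if (chunk.size() == chunkSize) {
                break;
            }
            if (row.status == Status.NOT_PROCESSED) {
                row.status = Status.IN_PROGRESS;
                chunk.add(row);
            }
        }
        return chunk;
    }

    // Called once the processor or writer has finished with a chunk
    synchronized void markProcessed(List<Row> chunk) {
        for (Row row : chunk) {
            row.status = Status.PROCESSED;
        }
    }

    public static void main(String[] args) {
        StatusChunkFetcher fetcher = new StatusChunkFetcher(5);
        List<Row> chunk;
        // Keep claiming chunks until no unprocessed rows are left
        while (!(chunk = fetcher.fetchChunk(2)).isEmpty()) {
            System.out.println("claimed " + chunk.size() + " rows");
            fetcher.markProcessed(chunk);
        }
    }
}
```

Note that synchronized only coordinates threads inside one JVM. Also, an actual ItemReader would return null instead of an empty list when nothing is left, since null is what signals the end of input to Spring Batch.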

Spring Batch: How can we preload values from the database and use them in the processor section
