Question

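I have a file with 50K records. Inserting it into the database takes nearly 40 minutes. So I am thinking of applying a partition to the step so that the 50K records are split across 10 threads (via gridSize), with each thread processing 1000 records in parallel.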

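All the forums show examples that use JDBCPagingItemReader, with the partition range set through the execution context. Since I am using MultiResourceItemReader, how do I set the partition range (startingIndex and endingIndex, see the code snippet below) for MultiResourceItemReader?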

Please advise.

Code snippet of the partitioner:

public Map<String, ExecutionContext> partition(int gridSize) {
    LOGGER.debug("START: Partition");
    Map<String, ExecutionContext> partitionMap = new HashMap<>();
    int startingIndex = 0;
    int endingIndex = 1000;

    // One ExecutionContext per partition, each holding its index range.
    for (int i = 0; i < gridSize; i++) {
        ExecutionContext ctxMap = new ExecutionContext();
        ctxMap.putInt("startingIndex", startingIndex);
        ctxMap.putInt("endingIndex", endingIndex);

        startingIndex = endingIndex + 1;
        endingIndex += 1000;

        partitionMap.put("Thread:-" + i, ctxMap);
    }
    LOGGER.debug("END: Created Partitions of size: " + partitionMap.size());
    return partitionMap;
}

Answer 1

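You do not have to set the partition count on the MultiResourceItemReader. Instead, use MultiResourcePartitioner to create one partition per resource (file), and have the reader pick up each file as its own partition. With that configuration you no longer need the MultiResourceItemReader at all (you can go straight to its delegate reader).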

The Spring Batch samples include an example of this use case, which can be found here: https://github.com/spring-projects/spring-batch/blob/master/spring-batch-samples/src/main/resources/jobs/partitionFileJob.xml
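
To make the one-partition-per-file idea concrete, here is a minimal plain-Java sketch of the mapping such a partitioner produces. It is an illustration only, not Spring Batch's own code: the class FilePartitionSketch, the file names, and the use of plain maps in place of ExecutionContext are all assumptions made for this example. The key name "fileName" and the "partition" + i naming are intended to mirror MultiResourcePartitioner's defaults as I understand them, so check them against the version you use.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is not Spring Batch code.
class FilePartitionSketch {

    // Assumed key under which each partition stores the file it owns.
    static final String KEY_NAME = "fileName";

    // Builds one partition per file. The number of partitions follows the
    // number of files, so no startingIndex/endingIndex is needed.
    static Map<String, Map<String, String>> partition(List<String> files) {
        Map<String, Map<String, String>> partitions = new LinkedHashMap<>();
        int i = 0;
        for (String file : files) {
            // Stand-in for an ExecutionContext: a small key/value map.
            Map<String, String> context = new LinkedHashMap<>();
            context.put(KEY_NAME, file);
            partitions.put("partition" + i, context);
            i++;
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical input files.
        List<String> files = Arrays.asList("records-1.csv", "records-2.csv", "records-3.csv");
        partition(files).forEach((name, context) ->
                System.out.println(name + " -> " + context.get(KEY_NAME)));
    }
}
```

In a real job, each worker step would read the file name from its own step execution context and pass it to a single-file reader, which is why the MultiResourceItemReader wrapper is no longer needed.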

How do I apply a partition count for MultiResourceItemReader?
