Bulk loading: Making sure all BulkProcessor jobs are completed (Java Client API)

Time: 2018-11-13 07:49:05

Tags: elasticsearch

I want to make a process that bulk loads data to ES so that

  1. There are two indices: index_1, index_2 and an alias that points to index_1 or index_2
  2. The data is bulk loaded to index_1 or index_2
  3. If all data is loaded without failures, the alias is changed

I'm using the Java Client API.

After adding data to the BulkProcessor, I want to be sure it has completed all jobs before I check whether there were any failures. I keep track of failures in BulkProcessor.Listener.afterBulk.

In my current test implementation, when all data is pushed to BulkProcessor, I call BulkProcessor.flush() and then I have added a timeout (just to be sure) before I check if afterBulk has recorded any failures.

But the question is: What can I do to make sure the BulkProcessor doesn't have any jobs left and all pushed IndexRequests have been completed?

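1 answer:

Answer 0 (score: 0)

The Java Client API (v <= 7.0) has no mechanism for checking the size of the bulk queue. You can keep track yourself of the IDs you have added and the IDs that have been marked as done (in afterBulk).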
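The bookkeeping described above could look roughly like the following. This is only a sketch using the plain JDK: the class PendingBulkTracker and its methods added, completed and awaitEmpty are hypothetical names, not part of the Elasticsearch client. It assumes every document has a known ID, that added(id) is called right before the request is handed to BulkProcessor.add, and that completed(...) is called from both BulkProcessor.Listener.afterBulk overloads (the BulkResponse one and the Throwable one) with the IDs contained in that bulk request.

```java
import java.util.Collection;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: remembers which document IDs were handed to the
// bulk processor and which have been reported back by afterBulk.
final class PendingBulkTracker {
    private final Set<String> pending = ConcurrentHashMap.newKeySet();
    private final AtomicInteger failures = new AtomicInteger();
    private final Object lock = new Object();

    // Call right before adding the request with this ID to the processor.
    void added(String id) {
        pending.add(id);
    }

    // Call from afterBulk with the IDs of the finished bulk request and
    // the number of items in it that failed.
    void completed(Collection<String> ids, int failedCount) {
        failures.addAndGet(failedCount);
        synchronized (lock) {
            for (String id : ids) {
                pending.remove(id);
            }
            lock.notifyAll(); // wake up any thread blocked in awaitEmpty
        }
    }

    // Blocks until no IDs are pending or the timeout elapses.
    // Returns true if everything completed in time, false on timeout.
    boolean awaitEmpty(long timeout, TimeUnit unit) throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        synchronized (lock) {
            while (!pending.isEmpty()) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false;
                }
                TimeUnit.NANOSECONDS.timedWait(lock, remaining);
            }
            return true;
        }
    }

    int pendingCount() {
        return pending.size();
    }

    int failureCount() {
        return failures.get();
    }
}
```

With something like this in place, the loader would call BulkProcessor.flush() once all data has been added, then awaitEmpty(...), and switch the alias only if it returned true and failureCount() is 0. It may also be worth checking whether BulkProcessor.awaitClose(timeout, unit) is enough for your case, since it flushes remaining actions and waits for in-flight bulk requests up to the given timeout.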