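Suppose we are scrolling through a large index (e.g. > 50M documents) and one of the requests fails after we have already scrolled through 99% of the documents. It looks like we would have to start over from the beginning, which is expensive and feels inefficient.
For scrolling, I am using a sliced scroll: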
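import java.util.stream.IntStream;
import org.elasticsearch.action.admin.cluster.shards.ClusterSearchShardsResponse;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.slice.SliceBuilder;

public void creatSlicedScroll(String index, String type) {
    ClusterSearchShardsResponse clusterSearchShardsResponse =
            searchShard(index, type);
    // One slice per shard group (SliceBuilder requires at least 2 slices)
    int slices = clusterSearchShardsResponse.getGroups().length;
    int scrollSize = 10;
    // Slice ids are zero-based, so iterate over 0..slices-1 (starting at 1 would skip slice 0)
    IntStream.range(0, slices).parallel().forEach(i -> {
        // Prepare the search for slice i
        SliceBuilder sliceBuilder = new SliceBuilder(i, slices);
        SearchResponse scrollResponse =
                client.prepareSearch(index).setTypes(type)
                        .setScroll("1m")
                        .slice(sliceBuilder)
                        .setSize(scrollSize)
                        .setFetchSource("id", null)
                        .get();
        fetchScroll(scrollResponse);
    });
}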
public void fetchScroll(SearchResponse scrollResp) {
    // Scroll until no hits are returned
    do {
        for (SearchHit hit : scrollResp.getHits().getHits()) {
            // Only the "id" part of _source is requested via setFetchSource,
            // so print the filtered source rather than the (empty) stored fields
            System.out.println(hit.getSourceAsString());
        }
        scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
                .setScroll("1m")
                .execute()
                .actionGet();
    } while (scrollResp.getHits().getHits().length != 0); // Zero hits mark the end of the scroll and the while loop.
}
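Please suggest how to handle this failure case. According to the documentation, the scroll context stays alive until the keep-alive time specified in the request expires. Consider a network failure with a scroll keep-alive of 2m, where the program has already read about 50% of the data. How can I resume from 50% and continue to 100%?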
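To make the question concrete, here is a minimal sketch of the resume behavior I am after. It is not Elasticsearch client code: `PageFetcher`, `fetchPageAfter`, `scanAll` and `flakyIndex` are hypothetical names, and the "index" is just an in-memory range of integer ids. The assumption is that results are sorted by a unique key and each request asks for "the next page after the last key I processed", which I believe is roughly what the `search_after` parameter offers. A failed request is then retried from the checkpoint instead of from the start.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ResumableScan {
    /** Hypothetical stand-in for a sorted query: up to `size` ids strictly greater than `afterId`. */
    public interface PageFetcher {
        List<Integer> fetchPageAfter(int afterId, int size);
    }

    /** Reads every id, retrying a failed page from the last checkpoint instead of restarting. */
    public static List<Integer> scanAll(PageFetcher fetcher, int pageSize, int maxRetries) {
        List<Integer> seen = new ArrayList<>();
        int checkpoint = -1; // last id processed successfully; a real job would persist this
        int failures = 0;    // consecutive failures for the current page
        while (true) {
            List<Integer> page;
            try {
                page = fetcher.fetchPageAfter(checkpoint, pageSize);
            } catch (UncheckedIOException e) {
                if (++failures > maxRetries) {
                    throw e; // give up; the checkpoint still marks where to resume later
                }
                continue;    // retry the same page from the same checkpoint
            }
            failures = 0;
            if (page.isEmpty()) {
                return seen; // an empty page marks the end of the data
            }
            seen.addAll(page);
            checkpoint = page.get(page.size() - 1);
        }
    }

    /** In-memory "index" holding ids 0..total-1 that fails exactly once, on call number `failOnCall`. */
    public static PageFetcher flakyIndex(int total, int failOnCall) {
        AtomicInteger calls = new AtomicInteger();
        return (afterId, size) -> {
            if (calls.incrementAndGet() == failOnCall) {
                throw new UncheckedIOException(new IOException("simulated network failure"));
            }
            List<Integer> page = new ArrayList<>();
            for (int id = afterId + 1; id < total && page.size() < size; id++) {
                page.add(id);
            }
            return page;
        };
    }

    public static void main(String[] args) {
        // The 6th request fails after 50 of 100 ids have been read; the scan resumes at id 50.
        List<Integer> ids = scanAll(flakyIndex(100, 6), 10, 3);
        System.out.println(ids.size()); // prints 100
    }
}
```

In this sketch the only state needed to resume is the last processed key, so it does not depend on a server-side scroll context staying alive. Whether the same can be done with a scroll id after a failed request is exactly what I am unsure about.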