Resuming a scroll request after an intermediate failure

Time: 2018-12-16 16:47:04

Tags: java elasticsearch elasticsearch-5

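Say we are scrolling through a large index (for example, more than 50M documents), and one of the requests fails after 99% of the documents have been read. It looks like we have to start over from the beginning, which is too expensive and feels inefficient.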

For scrolling, I am using a sliced scroll:

public void creatSlicedScroll(String index, String type){
    // One slice per shard group; searchShard(...) is a helper not shown here.
    ClusterSearchShardsResponse clusterSearchShardsResponse = searchShard(index, type);
    int slices = clusterSearchShardsResponse.getGroups().length;
    int scrollSize = 10;

    // Slice ids run from 0 to slices - 1, so the range has to start at 0;
    // starting at 1 would silently skip the first slice.
    IntStream.range(0, slices).parallel().forEach(i -> {
        // Prepare the initial search for slice i
        SliceBuilder sliceBuilder = new SliceBuilder(i, slices);
        SearchResponse scrollResponse = client.prepareSearch(index)
                .setTypes(type)
                .setScroll("1m")
                .slice(sliceBuilder)
                .setSize(scrollSize)
                .setFetchSource("id", null)
                .get();
        fetchScroll(scrollResponse);
    });
}




public void fetchScroll(SearchResponse scrollResp){
    //Scroll until no hits are returned
    do {
        for (SearchHit hit : scrollResp.getHits().getHits()) {
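            // Fields object for each document. Note: getFields() holds stored fields;
            // values selected with setFetchSource(...) arrive in the hit's _source instead.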
            Map<String, SearchHitField> responseFields = hit.getFields();
            System.out.println(responseFields.toString());
        }
        scrollResp = client.prepareSearchScroll(
                scrollResp.getScrollId())
                .setScroll("1m")
                .execute()
                .actionGet();
    } while(scrollResp.getHits().getHits().length != 0); // Zero hits mark the end of the scroll and the while loop.
}
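
One possible way to make the loop above tolerate a transient failure is to retry the failing call a few times while the scroll context is still alive. The following is only a minimal sketch using the JDK alone; `ScrollRetry` and `withRetry` are hypothetical names introduced for illustration and are not part of the Elasticsearch client.

```java
import java.util.function.Supplier;

public class ScrollRetry {
    /**
     * Runs the action up to maxAttempts times, sleeping backoffMillis * attempt
     * between failed attempts. Hypothetical helper, not an Elasticsearch API.
     * The total waiting time should stay below the scroll keep-alive.
     */
    public static <T> T withRetry(Supplier<T> action, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Linear backoff before the next attempt
                        Thread.sleep(backoffMillis * attempt);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and give up early
                        Thread.currentThread().interrupt();
                        throw e;
                    }
                }
            }
        }
        // Every attempt failed: surface the last error to the caller
        throw last;
    }
}
```

In `fetchScroll`, the scroll call could then be wrapped as `ScrollRetry.withRetry(() -> client.prepareSearchScroll(id).setScroll("1m").execute().actionGet(), 3, 500)`, where `id` is a final copy of the current scroll id. One caveat: if the server had already processed the failed request and only the response was lost, the retry may return the following page, so this does not by itself guarantee that no page is skipped.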

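Please suggest how to handle this failure case. According to the documentation, the scroll context stays alive until the keep-alive time specified in the request expires. Consider a network failure where the scroll keep-alive is 2m and the program has already read about 50% of the data. How can I resume from 50% and continue to 100%?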
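
A scroll id cannot be used once its context has expired, so resuming after a longer outage would need a checkpoint kept on the client side. One option sometimes suggested is to page with Elasticsearch's `search_after` feature, sorted on a unique field, and to remember the last sort value processed. The sketch below only models that idea in plain Java and does not call Elasticsearch: `CheckpointedReader` and `PageSource` are hypothetical names, and the ids are assumed to be unique and returned in ascending order.

```java
import java.util.List;

public class CheckpointedReader {
    /**
     * Hypothetical page source: returns up to `size` ids strictly greater
     * than `afterId`, in ascending order (the contract search_after offers
     * when sorting on a unique field).
     */
    public interface PageSource {
        List<Long> fetchAfter(long afterId, int size);
    }

    // Last id that was fully processed; a real job would persist this durably.
    private long checkpoint;

    public CheckpointedReader(long startAfter) {
        this.checkpoint = startAfter;
    }

    public long getCheckpoint() {
        return checkpoint;
    }

    /**
     * Reads pages until an empty page is returned. If fetchAfter throws, the
     * exception propagates but the checkpoint still points at the last page
     * that was processed, so calling run(...) again resumes from there.
     */
    public void run(PageSource source, int pageSize, List<Long> sink) {
        while (true) {
            List<Long> page = source.fetchAfter(checkpoint, pageSize);
            if (page.isEmpty()) {
                return; // no more documents
            }
            sink.addAll(page);
            // Advance the checkpoint only after the page was handled
            checkpoint = page.get(page.size() - 1);
        }
    }
}
```

The trade-off, as far as I understand it, is that this style of paging does not read from a frozen snapshot the way a scroll does, so documents indexed or deleted during the run can affect what is returned.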

0 answers:

No answers