Question

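Let's say we are scrolling over a large index (e.g. > 50M documents) and one of the requests fails after 99% of the documents have been scrolled through. It looks like we would have to start over from the beginning, which is expensive and feels inefficient.

For scrolling, I am using a sliced scroll: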

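import java.util.Map;
import java.util.stream.IntStream;
import org.elasticsearch.action.admin.cluster.shards.ClusterSearchShardsResponse;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHitField;
import org.elasticsearch.search.slice.SliceBuilder;

public void creatSlicedScroll(String index, String type) {
    // Use one slice per shard group of the index
    ClusterSearchShardsResponse clusterSearchShardsResponse = searchShard(index, type);
    int slices = clusterSearchShardsResponse.getGroups().length;
    int scrollSize = 10;

    // Slice ids are zero-based: 0 .. slices - 1
    IntStream.range(0, slices).parallel().forEach(i -> {
        // Prepare the initial search for this slice
        SliceBuilder sliceBuilder = new SliceBuilder(i, slices);
        SearchResponse scrollResponse = client.prepareSearch(index)
                .setTypes(type)
                .setScroll("1m")
                .slice(sliceBuilder)
                .setSize(scrollSize)
                .setFetchSource("id", null)
                .get();
        fetchScroll(scrollResponse);
    });
}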




public void fetchScroll(SearchResponse scrollResp) {
    // Scroll until no hits are returned
    do {
        for (SearchHit hit : scrollResp.getHits().getHits()) {
            // Fields object for each document.
            // Note: getFields() only holds stored/doc-value fields; with
            // setFetchSource(...) the "id" value is in the hit's _source instead.
            Map<String, SearchHitField> responseFields = hit.getFields();
            System.out.println(responseFields.toString());
        }
        // Fetch the next batch, renewing the scroll context for another minute
        scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
                .setScroll("1m")
                .execute()
                .actionGet();
    } while (scrollResp.getHits().getHits().length != 0); // Zero hits mark the end of the scroll and the while loop.
}

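Please suggest how to handle this failure case. According to the documentation, the scroll context is kept alive until the time specified in the request elapses. Consider a case where a network failure occurs, the scroll keep-alive is 2m, and the program has already read about 50% of the data. How can I resume from 50% and continue to 100%?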
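To make the idea concrete: since the scroll context stays alive for the keep-alive window, one thing I am considering is retrying the failed scroll call with the same scroll id instead of restarting. Below is a minimal sketch of such a retry wrapper. `ScrollRetry` and `withRetry` are hypothetical names of my own, not part of the Elasticsearch client, and the sketch assumes the failure surfaces as a `RuntimeException` and that all retries finish before the keep-alive expires.

```java
import java.util.function.Supplier;

public class ScrollRetry {

    /**
     * Runs the action and retries it when it throws, up to maxAttempts
     * attempts in total. Hypothetical helper, not an Elasticsearch API.
     */
    public static <T> T withRetry(Supplier<T> action, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Linear backoff; the total wait must stay below the scroll keep-alive
                        Thread.sleep(backoffMillis * attempt);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw e;
                    }
                }
            }
        }
        // All attempts failed: rethrow the last failure
        throw last;
    }
}
```

In fetchScroll this would wrap the scroll call, e.g. `scrollResp = ScrollRetry.withRetry(() -> client.prepareSearchScroll(id).setScroll("1m").get(), 3, 500);` where `id` is the current scroll id. As far as I can tell this only covers short outages: once the context has expired the slice would still have to be restarted, and if the server already advanced the scroll before the response was lost, the retry may return the following batch and skip one.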

Resume a scroll request after an intermediate failure

0 answers: