Processing multiple documents from Elasticsearch using the Java RestClient API

Time: 2018-06-26 13:20:59

Tags: java json elasticsearch elastic-stack

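I am using the Java API to fetch documents from Elasticsearch. I can only correctly retrieve a single document from the responseBody.

What should I do if I get multiple documents in the response?

I previously used the RestHighLevelClient API, where I could handle multiple documents with SearchHit[] searchHits = searchResponse.getHits().getHits();.

I cannot do the same with the RestClient API.

Please find my code below. It fetches documents from Elasticsearch and parses them into a JSON object (works for a single document).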

    private final static String ATTACHMENT = "document_attachment";
    private final static String TYPE = "doc";
    static long BUFFER_SIZE = 520 * 1024 * 1024;   //  <---- set buffer to 520MB instead of 100MB


    public static void main(String args[])
    {
        RestClient restClient = null;
        Response contentSearchResponse=null;
        String responseBody = null;
        JSONObject source = null;
        String path = null;
        String filename = null;
        int id = 0;
        ResponseHits responseHits = null;

        RestClientBuilder builder =  null; 

        try {

        restClient = RestClient.builder(
                        new HttpHost("localhost", 9200, "http"),
                        new HttpHost("localhost", 9201, "http")).build();

        } catch (Exception e) {
            System.out.println(e.getMessage());
        }

        SearchRequest contentSearchRequest = new SearchRequest(ATTACHMENT); 
        SearchSourceBuilder contentSearchSourceBuilder = new SearchSourceBuilder();
        contentSearchRequest.types(TYPE);
        QueryBuilder attachmentQB = QueryBuilders.matchQuery("attachment.content", "activa");
        contentSearchSourceBuilder.query(attachmentQB);
        contentSearchSourceBuilder.size(50);
        contentSearchRequest.source(contentSearchSourceBuilder);
        System.out.println("Request --->"+contentSearchRequest.toString());

        Map<String, String> params = Collections.emptyMap();
        HttpEntity entity = new NStringEntity(contentSearchSourceBuilder.toString(), ContentType.APPLICATION_JSON);
        HttpAsyncResponseConsumerFactory.HeapBufferedResponseConsumerFactory consumerFactory =
                new HttpAsyncResponseConsumerFactory.HeapBufferedResponseConsumerFactory((int) BUFFER_SIZE);


        try {
            contentSearchResponse = restClient.performRequest("GET", "/document_attachment/doc/_search", params, entity, consumerFactory);
        } catch (IOException e1) {
            e1.printStackTrace();
        } 
        try {
            responseBody = EntityUtils.toString(contentSearchResponse.getEntity());
        } catch (ParseException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("Converting to JSON");
        JSONObject jsonObject = new JSONObject(responseBody);
        JSONObject  hits = jsonObject.getJSONObject("hits");
        JSONArray hitsArray=hits.getJSONArray("hits");
        for(int i=0;i<hitsArray.length();i++) {
            JSONObject obj= hitsArray.getJSONObject(i);
            source = obj.getJSONObject("_source");
            id = Integer.parseInt(source.opt("id").toString());
            path = source.optString("path");
            filename = source.optString("filename");

        }

        JSONObject jsonBody = new JSONObject();
        jsonBody.put("id", id);
        jsonBody.put("path", path);
        jsonBody.put("filename", filename);
        System.out.println("Response --->"+jsonBody.toString());

        }

2 answers:

Answer 0: (score: 0)

If you use

RestClientBuilder builder = RestClient.builder(
            new HttpHost("localhost", 
            9200, 
            "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);

you can get multiple results like this:

SearchResponse searchResponse = restHighLevelClient.search(searchRequest);
for (SearchHit hit : searchResponse.getHits()) {
    try {
        // _source of the current hit, without the Elasticsearch metadata
        Map<String, Object> sourceAsMap = hit.getSourceAsMap();
        JSONObject jo = new JSONObject(sourceAsMap);
    } catch (JSONException e) {
        // TODO: handle the exception in a useful way
        // e.printStackTrace();
    }
}

This lets you iterate over the multiple hits of your request without any Elasticsearch-specific metadata in your result set.
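Note that the loop in the question overwrites id, path and filename on every iteration, so only the last hit is printed. A minimal sketch of building one entry per hit instead is shown here. It uses plain java.util maps as a stand-in for what hit.getSourceAsMap() returns; the class name HitCollector and the sample values are invented for illustration and are not part of any Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class HitCollector {

    // Builds one result entry per hit instead of overwriting shared variables.
    static List<Map<String, Object>> collect(List<Map<String, Object>> sources) {
        List<Map<String, Object>> results = new ArrayList<>();
        for (Map<String, Object> source : sources) {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("id", Integer.parseInt(String.valueOf(source.get("id"))));
            entry.put("path", source.getOrDefault("path", ""));
            entry.put("filename", source.getOrDefault("filename", ""));
            results.add(entry);
        }
        return results;
    }

    // Helper that creates a fake _source map (assumption: same three fields as in the question).
    static Map<String, Object> source(Object id, String path, String filename) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("id", id);
        source.put("path", path);
        source.put("filename", filename);
        return source;
    }

    public static void main(String[] args) {
        // Sample data standing in for the _source of two hits (invented values).
        List<Map<String, Object>> sources = Arrays.asList(
                source(1, "/docs/a.pdf", "a.pdf"),
                source("2", "/docs/b.pdf", "b.pdf"));

        for (Map<String, Object> entry : collect(sources)) {
            System.out.println("Response --->" + entry);
        }
    }
}
```

The same pattern applies to the low-level client: inside the loop over hitsArray, create a new JSONObject per hit and add it to a JSONArray rather than writing to variables declared outside the loop.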

Answer 1: (score: 0)

Use the scroll API. It is useful when the result set is large.

From the documentation:

  

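While a search request returns a single "page" of results, the scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database.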
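The control flow of scrolling can be sketched as follows. This is only an illustration of the loop, not the Elasticsearch client itself: ScrollClient is a hypothetical interface standing in for the two HTTP calls (the initial search with a scroll parameter, then repeated calls to /_search/scroll with the returned scroll id), and fakeClient is an in-memory substitute so the sketch runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScrollSketch {

    // One "page" of results plus the scroll id needed to request the next page.
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    // Hypothetical transport: stands in for the two HTTP calls of the scroll API.
    interface ScrollClient {
        Page search();                   // initial search that opens the scroll
        Page nextPage(String scrollId);  // follow-up request using the scroll id
    }

    // Keeps requesting pages until a page comes back with no hits.
    static List<String> fetchAll(ScrollClient client) {
        List<String> all = new ArrayList<>();
        Page page = client.search();
        while (!page.hits.isEmpty()) {
            all.addAll(page.hits);
            page = client.nextPage(page.scrollId);
        }
        return all;
    }

    // In-memory fake that serves the given documents pageSize at a time.
    static ScrollClient fakeClient(List<String> docs, int pageSize) {
        return new ScrollClient() {
            private int offset = 0;

            private Page slice() {
                int end = Math.min(offset + pageSize, docs.size());
                List<String> hits = new ArrayList<>(docs.subList(offset, end));
                offset = end;
                return new Page("scroll-" + offset, hits);
            }

            @Override
            public Page search() {
                offset = 0;
                return slice();
            }

            @Override
            public Page nextPage(String scrollId) {
                return slice();
            }
        };
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5");
        System.out.println(fetchAll(fakeClient(docs, 2)));
    }
}
```

With a real cluster, each page would be parsed the same way as the single search response in the question, and the scroll context should be cleared once the last page has been read.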

Related links

Elastic Search Scroll Behaviour

Documentation

Parallel Scan & Scroll an Elasticsearch Index