How to optimize an Elasticsearch query?

Date: 2018-10-17 10:48:19

Tags: java elasticsearch

I am trying to run an Elasticsearch query using the Java High Level REST Client. The main goal is to group the results. Here is the data:

    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "object",
        "_id" : "X4sSPmYB62YwufswHQbx",
        "_score" : 1.0,
        "_source" : {
          "objId" : "1",
          "stepId" : "step_one",
          "status" : "RUNNING",
          "timestamp" : 1515974400
        }
      },
      {
        "_index" : "my_index",
        "_type" : "object",
        "_id" : "15QRP2YB62YwufswAApl",
        "_score" : 1.0,
        "_source" : {
          "objId" : "1",
          "stepId" : "step_one",
          "status" : "DONE",
          "timestamp" : 1516406400
        }
      },
      {
        "_index" : "my_index",
        "_type" : "object",
        "_id" : "QpMOP2YB62YwufswrfYn",
        "_score" : 1.0,
        "_source" : {
          "objId" : "1",
          "stepId" : "step_two",
          "status" : "RUNNING",
          "timestamp" : 1516492800
        }
      },
      {
        "_index" : "my_index",
        "_type" : "object",
        "_id" : "VZMPP2YB62YwufswJ_r0",
        "_score" : 1.0,
        "_source" : {
          "objId" : "1",
          "stepId" : "step_two",
          "status" : "DONE",
          "timestamp" : 1517356800
        }
      },
      {
        "_index" : "my_index",
        "_type" : "object",
        "_id" : "XZMPP2YB62YwufswQfrc",
        "_score" : 1.0,
        "_source" : {
          "objId" : "2",
          "stepId" : "step_one",
          "status" : "DONE",
          "timestamp" : 1517788800
        }
      }
    ]

For example, for objId = 1, I would like to retrieve something like this:

    {
      "objId" : "1",
      "stepId" : "step_one",
      "status" : "DONE",
      "timestamp" : 1516406400
    },
    {
      "objId" : "1",
      "stepId" : "step_two",
      "status" : "DONE",
      "timestamp" : 1517356800
    }

Right now I have this Java method:

    private List<MyObject> search(String objId) {
        // Match documents by objId and fetch up to 1000 hits; no grouping is applied
        MatchPhraseQueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("objId", objId);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(queryBuilder);
        searchSourceBuilder.size(1000);

        SearchRequest searchRequest = new SearchRequest("my_index");
        searchRequest.types("object");
        searchRequest.source(searchSourceBuilder);

        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest);

            // Convert every hit to a MyObject
            return Arrays.stream(searchResponse.getHits().getHits())
                    .map(this::toMyObject)
                    .collect(toList());
        } catch (IOException ex) {
            // The exception is passed as the last argument, so no {} placeholder is needed
            log.error("Error retrieving records from Elasticsearch.", ex);
        }

        return new ArrayList<>();
    }

But this method only returns the full list of objects found for the given objId.

My question is: is it possible to find objects by objId value, then group them by stepId, and finally keep only the result with the latest timestamp in each group?
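To make the requested semantics concrete, here is a minimal plain-Java sketch of "group by stepId, keep the latest timestamp", applied to a list that has already been fetched. It is client-side post-processing only, not an Elasticsearch-side solution. The `MyObject` class below is a hypothetical stand-in (the real one is not shown in the question), and `latestPerStep` is a name invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

final class LatestPerStep {

    // Hypothetical stand-in for the MyObject type used in the question.
    static final class MyObject {
        final String objId;
        final String stepId;
        final String status;
        final long timestamp;

        MyObject(String objId, String stepId, String status, long timestamp) {
            this.objId = objId;
            this.stepId = stepId;
            this.status = status;
            this.timestamp = timestamp;
        }
    }

    // For each stepId, keep only the record with the greatest timestamp.
    // LinkedHashMap keeps the groups in the order their stepId first appeared.
    static List<MyObject> latestPerStep(List<MyObject> records) {
        Map<String, MyObject> latest = records.stream()
                .collect(Collectors.toMap(
                        r -> r.stepId,
                        Function.identity(),
                        BinaryOperator.maxBy(Comparator.comparingLong((MyObject r) -> r.timestamp)),
                        LinkedHashMap::new));
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        // The four objId = 1 records from the sample data above.
        List<MyObject> hits = Arrays.asList(
                new MyObject("1", "step_one", "RUNNING", 1515974400L),
                new MyObject("1", "step_one", "DONE", 1516406400L),
                new MyObject("1", "step_two", "RUNNING", 1516492800L),
                new MyObject("1", "step_two", "DONE", 1517356800L));

        // Prints:
        // step_one DONE 1516406400
        // step_two DONE 1517356800
        for (MyObject r : latestPerStep(hits)) {
            System.out.println(r.stepId + " " + r.status + " " + r.timestamp);
        }
    }
}
```

This only works when all matching documents fit in the fetched page (1000 hits in the method above), which is one reason to prefer doing the grouping inside Elasticsearch.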

1 Answer:

Answer 0 (score: 0)

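Here is the answer to the problem that I found. It uses a terms aggregation on stepId with a top_hits sub-aggregation sorted by timestamp in descending order and limited to one hit, so each bucket yields only its latest document: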

    private List<MyObject> search(String objId) {
        try {
            SearchResponse searchResponse = esRestClient.search(new SearchRequest("my_index")
                    .source(new SearchSourceBuilder()
                            .query(QueryBuilders.matchPhraseQuery("objId", objId))
                            // size(0): skip the top-level hits, only the aggregation is needed
                            .size(0)
                            .aggregation(
                                    // One bucket per stepId; "stepId.keyword" assumes the default dynamic mapping.
                                    // A terms aggregation returns at most 10 buckets unless its size is raised.
                                    AggregationBuilders.terms("by_stepId").field("stepId.keyword")
                                            // Within each bucket keep only the newest document
                                            .subAggregation(AggregationBuilders.topHits("by_timestamp")
                                                    .sort("timestamp", SortOrder.DESC)
                                                    .size(1)
                                            )
                            )
                    )
                    .types("object")
            );
            Terms terms = searchResponse.getAggregations().get("by_stepId");
            // Unwrap bucket -> sub-aggregations -> top hits -> MyObject
            return terms.getBuckets().stream()
                    .map(MultiBucketsAggregation.Bucket::getAggregations)
                    .flatMap(aggregations -> aggregations.asList().stream())
                    .map(aggregation -> (ParsedTopHits) aggregation)
                    .flatMap(topHits -> Arrays.stream(topHits.getHits().getHits()))
                    .map(this::toMyObject)
                    .collect(toList());
        } catch (IOException ex) {
            // The exception is passed as the last argument, so no {} placeholder is needed
            log.error("Error retrieving records from Elasticsearch.", ex);
        }
        return new ArrayList<>();
    }
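
The `toMyObject` helper is referenced in both snippets but never shown. As a rough sketch only, it might read the hit's `_source` as a `Map<String, Object>` (which `SearchHit.getSourceAsMap()` returns in the high-level client) and copy the four fields across. The `MyObject` class and the `fromSource` name below are assumptions made for illustration, not the author's code.

```java
import java.util.HashMap;
import java.util.Map;

final class SourceMapping {

    // Hypothetical stand-in for the MyObject type; the real class is not shown.
    static final class MyObject {
        final String objId;
        final String stepId;
        final String status;
        final long timestamp;

        MyObject(String objId, String stepId, String status, long timestamp) {
            this.objId = objId;
            this.stepId = stepId;
            this.status = status;
            this.timestamp = timestamp;
        }
    }

    // Builds a MyObject from a document's _source map.
    // A JSON number may arrive as Integer or Long, so it is read through Number.
    static MyObject fromSource(Map<String, Object> source) {
        long timestamp = ((Number) source.get("timestamp")).longValue();
        return new MyObject(
                (String) source.get("objId"),
                (String) source.get("stepId"),
                (String) source.get("status"),
                timestamp);
    }

    public static void main(String[] args) {
        // Mirrors one _source object from the sample data in the question.
        Map<String, Object> source = new HashMap<>();
        source.put("objId", "1");
        source.put("stepId", "step_one");
        source.put("status", "DONE");
        source.put("timestamp", 1516406400);

        MyObject o = fromSource(source);
        // Prints: step_one DONE 1516406400
        System.out.println(o.stepId + " " + o.status + " " + o.timestamp);
    }
}
```

Under that assumption, `toMyObject(SearchHit hit)` would simply delegate to `fromSource(hit.getSourceAsMap())`.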