Spring Data aggregation query with Elasticsearch

Date: 2019-06-28 12:55:40

Tags: java elasticsearch spring-data spring-data-elasticsearch

I am trying to get the Elasticsearch query below working with Spring Data. The goal is to return the unique values of the field "serviceName", comparable to SELECT DISTINCT serviceName FROM table in a SQL database.

{
  "aggregations": {
    "serviceNames": {
      "terms": {
        "field": "serviceName"
      }
    }
  },
  "size":0
}

I mapped the field as a keyword, and the query works perfectly against the index_name/_search API, as the following response snippet shows:

"aggregations": {
        "serviceNames": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "service1",
                    "doc_count": 20
                },
                {
                    "key": "service2",
                    "doc_count": 8
                },
                {
                    "key": "service3",
                    "doc_count": 8
                }
            ]
        }
    }
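What the terms aggregation computes can be mimicked in plain Java, which may help when reading the buckets: one bucket per distinct value, with a document count, ordered by count descending and then key ascending, and capped at a size. This is only a minimal sketch of the semantics and does not call Elasticsearch; the class name TermsAggregationSketch, the method termsBuckets, and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class TermsAggregationSketch {

    // Groups equal values into buckets, counts them, orders the buckets by
    // count (descending) then key (ascending), and keeps at most `size` buckets.
    static Map<String, Long> termsBuckets(List<String> fieldValues, int size) {
        Map<String, Long> counts = fieldValues.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(size)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Hypothetical serviceName values, one per document.
        List<String> serviceNames = Arrays.asList(
                "service1", "service2", "service1", "service3", "service2", "service1");

        Map<String, Long> buckets = termsBuckets(serviceNames, 10);

        // The bucket keys are the "SELECT DISTINCT" result.
        System.out.println(buckets.keySet()); // prints [service1, service2, service3]
    }
}
```

If the size is smaller than the number of distinct values, the least frequent values are silently dropped, which is why the size on the aggregation matters for a DISTINCT-style query.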

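My problem is that the same query does not work in Spring Data when I run it as a StringQuery; I get the error below. My guess is that it uses a different API to run the query.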

Cannot execute jest action , response code : 400 , error : {"root_cause":[{"type":"parsing_exception","reason":"no [query] registered for [aggregations]","line":2,"col":19}],"type":"parsing_exception","reason":"no [query] registered for [aggregations]","line":2,"col":19} , message : null
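The error message is consistent with the StringQuery source being placed where a single query clause is expected, so the top-level key "aggregations" gets looked up as a query type. The plain-Java sketch below only illustrates that idea with string concatenation; the class StringQueryBodySketch and the method effectiveBody are hypothetical, and the real template may wrap the source differently (for example through a wrapper query).

```java
public class StringQueryBodySketch {

    // Conceptual only: puts the StringQuery source in the "query" position
    // of a search body, where Elasticsearch expects one query clause.
    static String effectiveBody(String stringQuerySource) {
        return "{\"query\":" + stringQuerySource + "}";
    }

    public static void main(String[] args) {
        String source = "{\"aggregations\":{\"serviceNames\":"
                + "{\"terms\":{\"field\":\"serviceName\"}}},\"size\":0}";

        // "aggregations" now sits directly under "query", which is where
        // a query name such as "match_all" or "term" would be expected.
        System.out.println(effectiveBody(source));
    }
}
```

Under that assumption, a StringQuery can carry only the query part of a request, and aggregations have to be attached another way, such as through a NativeSearchQueryBuilder.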

I also tried the SearchQuery type to get the same result, with no duplicates and no objects loaded, but had no luck. The snippet below shows what I tried.

final TermsAggregationBuilder aggregation = AggregationBuilders
        .terms("serviceName")
        .field("serviceName")
        .size(1);
SearchQuery searchQuery = new NativeSearchQueryBuilder()
        .withIndices("index_name")
        .withQuery(matchAllQuery())
        .addAggregation(aggregation)
        .withSearchType(SearchType.DFS_QUERY_THEN_FETCH)
        .withSourceFilter(new FetchSourceFilter(new String[] {"serviceName"}, new String[] {""}))
        .withPageable(PageRequest.of(0, 10000))
        .build();

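Does anyone know how to run an aggregation on Spring Data that returns the distinct values of an object property without loading the objects?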

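I tried many times to print the query that Spring Data generates, without success, perhaps because I am using the com.github.vanroy.springdata.jest.JestElasticsearchTemplate implementation. I was able to log the following parts of the query: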

logger.info("query:" + searchQuery.getQuery());
logger.info("agregations:" + searchQuery.getAggregations());
logger.info("filter:" + searchQuery.getFilter());
logger.info("search type:" + searchQuery.getSearchType());

It prints:

query:{"match_all":{"boost":1.0}}
agregations:[{"serviceName":{"terms":{"field":"serviceName","size":1,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]}}}]
filter:null
search type:DFS_QUERY_THEN_FETCH

1 answer:

Answer 0 (score: 0)

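I figured it out; maybe this helps someone. The aggregations do not come with the query hits. They are a separate result of their own and are not mapped to any object. The object hits that do come back appear to be a sample of the documents Elasticsearch queried to run the aggregation (I am not sure about this). In the end I wrote a method that mimics SQL's SELECT DISTINCT your_column FROM your_table, but I think it only works on keyword fields, which, if I am not mistaken, are limited to 256 characters. I explain some details in the comments. Thanks to @Val, because I could only figure it out by debugging into the Jest code and inspecting the generated request and the raw response.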

public List<String> getDistinctField(String fieldName) {
    List<String> result = new ArrayList<>();

    try {
        final String distinctAggregationName = "distinct_field"; // name of the aggregation

        final TermsAggregationBuilder aggregation = AggregationBuilders
                .terms(distinctAggregationName)
                .field(fieldName)
                .size(10000); // maximum number of buckets returned; mine can be huge, adjust yours

        SearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withIndices("your_index") // can probably be omitted
                .addAggregation(aggregation)
                // fetch only the field we are interested in; this can probably be removed
                .withSourceFilter(new FetchSourceFilter(new String[] { fieldName }, new String[] { "" }))
                // page size can't be zero, and I don't want to load 10 hits on every run;
                // one object is always returned because I found no "size": 0 in the query builder
                .withPageable(PageRequest.of(0, 1))
                .build();

        // I had to use JestResultsExtractor because
        // com.github.vanroy.springdata.jest.JestElasticsearchTemplate has no implementation
        // for ResultsExtractor; with the Spring defaults you can probably use ResultsExtractor.
        final JestResultsExtractor<SearchResult> extractor = new JestResultsExtractor<SearchResult>() {
            @Override
            public SearchResult extract(SearchResult searchResult) {
                return searchResult;
            }
        };

        final SearchResult searchResult = ((JestElasticsearchTemplate) elasticsearchOperations)
                .query(searchQuery, extractor);
        final MetricAggregation aggregations = searchResult.getAggregations();
        // the aggregation results live here, in the "buckets"
        final TermsAggregation termsAggregation = aggregations.getTermsAggregation(distinctAggregationName);
        result = termsAggregation.getBuckets().parallelStream()
                .map(TermsAggregation.Entry::getKey)
                .collect(Collectors.toList());
    } catch (Exception e) {
        // handle your error here
        e.printStackTrace();
    }
    return result;
}