I have an Elasticsearch instance holding thousands of documents. My index has two fields like this:
| type    | date_added              |
| walking | 2018-11-27T00:00:00.000 |
| walking | 2018-11-26T00:00:00.000 |
| running | 2018-11-24T00:00:00.000 |
| running | 2018-11-25T00:00:00.000 |
| walking | 2018-11-27T04:00:00.000 |
I want to group by the "type" field and count the matching documents within a given date range. In SQL, I would do the following:
select type,
count(type)
from index
where date_added between '2018-11-20' and '2018-11-30'
group by type
I would like to get something like this:
| type    | count |
| running | 2     |
| walking | 3     |
I am using the High Level REST Client API in my project. So far my query looks like this; it only filters by start and end time:
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(QueryBuilders
    .boolQuery()
    .must(QueryBuilders
        .rangeQuery("date_added")
        .from(start.getTime())
        .to(end.getTime()))
);
How can I do a "group by" on the "type" field? Is it possible to do this in Elasticsearch?
Answer 0: (score: 2)
That's a good start! Now you need to add a terms aggregation to your query:
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(QueryBuilders.boolQuery()
    .must(QueryBuilders
        .rangeQuery("date_added")
        .from(start.getTime())
        .to(end.getTime()))
);
// add these two lines
// "type.keyword" assumes the default dynamic mapping, which adds a keyword sub-field to text fields
TermsAggregationBuilder groupBy = AggregationBuilders.terms("byType").field("type.keyword");
sourceBuilder.aggregation(groupBy);
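To make it concrete what the range filter plus terms aggregation computes, here is a minimal plain-Java sketch with no Elasticsearch dependency. It reproduces the SQL from the question over the five sample rows. The names `TermsSketch`, `Doc`, `countByType` and `sampleDocs` are hypothetical and exist only for this illustration; this is not how Elasticsearch implements the aggregation, just the same grouping logic in memory.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class TermsSketch {

    // Hypothetical stand-in for an indexed document with the two fields from the question.
    public static final class Doc {
        final String type;
        final LocalDateTime dateAdded;

        Doc(String type, String dateAdded) {
            this.type = type;
            this.dateAdded = LocalDateTime.parse(dateAdded);
        }
    }

    // Keep documents whose dateAdded lies in [from, to] (the range query),
    // then count them per type (the terms aggregation). TreeMap sorts keys alphabetically.
    public static Map<String, Long> countByType(List<Doc> docs, LocalDateTime from, LocalDateTime to) {
        return docs.stream()
                .filter(d -> !d.dateAdded.isBefore(from) && !d.dateAdded.isAfter(to))
                .collect(Collectors.groupingBy(d -> d.type, TreeMap::new, Collectors.counting()));
    }

    // The five sample rows from the question.
    public static List<Doc> sampleDocs() {
        return Arrays.asList(
                new Doc("walking", "2018-11-27T00:00:00.000"),
                new Doc("walking", "2018-11-26T00:00:00.000"),
                new Doc("running", "2018-11-24T00:00:00.000"),
                new Doc("running", "2018-11-25T00:00:00.000"),
                new Doc("walking", "2018-11-27T04:00:00.000"));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = countByType(
                sampleDocs(),
                LocalDateTime.parse("2018-11-20T00:00:00"),
                LocalDateTime.parse("2018-11-30T00:00:00"));
        // Prints one "type count" pair per line.
        counts.forEach((type, count) -> System.out.println(type + " " + count));
    }
}
```

For the sample rows this yields running 2 and walking 3, which matches the result the question asks for. In the real query, each entry of this map corresponds to one bucket of the "byType" aggregation.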
Answer 1: (score: 0)
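After aggregating the field as described in Val's reply, I wanted to print the query's aggregation buckets together with their counts. Here is what I did: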
// requires: import org.elasticsearch.search.aggregations.bucket.terms.Terms;
Terms terms = searchResponse.getAggregations().get("byType");
// getBuckets() already returns a list of Terms.Bucket, so no cast is needed
for (Terms.Bucket bucket : terms.getBuckets()) {
    System.out.println("Type: " + bucket.getKeyAsString() + " = Count(" + bucket.getDocCount() + ")");
}
Here is the output after running the query against an index of 2700 documents, each with a field named "type" that takes 2 distinct values:
Type: walking = Count(900)
Type: running = Count(1800)