{
  "size": 0,
  "aggs": {
    "categories_agg": {
      "terms": {
        "field": "categories",
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}
To get an aggregation on a specific field, I used the query above. It works fine and returns results like this:
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 77445,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "categories_agg": {
      "doc_count_error_upper_bound": 794,
      "sum_other_doc_count": 148316,
      "buckets": [
        {
          "key": "Restaurants",
          "doc_count": 25071
        },
        {
          "key": "Shopping",
          "doc_count": 11233
        },
        {
          "key": "Food",
          "doc_count": 9250
        },
        {
          "key": "Beauty & Spas",
          "doc_count": 6583
        },
        {
          "key": "Health & Medical",
          "doc_count": 5121
        },
        {
          "key": "Nightlife",
          "doc_count": 5088
        },
        {
          "key": "Home Services",
          "doc_count": 4785
        },
        {
          "key": "Bars",
          "doc_count": 4328
        },
        {
          "key": "Automotive",
          "doc_count": 4208
        },
        {
          "key": "Local Services",
          "doc_count": 3468
        }
      ]
    }
  }
}
Is there a way to filter this aggregation so that I only get the buckets whose doc_count falls within a specific range? For example, a range filter on doc_count with a max of 25000 and a min of 5000 should give me:
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 77445,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "categories_agg": {
      "doc_count_error_upper_bound": 794,
      "sum_other_doc_count": 148316,
      "buckets": [
        {
          "key": "Shopping",
          "doc_count": 11233
        },
        {
          "key": "Food",
          "doc_count": 9250
        },
        {
          "key": "Beauty & Spas",
          "doc_count": 6583
        },
        {
          "key": "Health & Medical",
          "doc_count": 5121
        },
        {
          "key": "Nightlife",
          "doc_count": 5088
        }
      ]
    }
  }
}
Answer 0 (score: 3)
I solved this with the bucket_selector pipeline aggregation. The count can be filtered in the script:
```
"aggs": {
"categories_agg": {
"terms": {
"field": "cel_num",
"size": 5000,
"min_doc_count":1
},
"aggs": {
"count_bucket_selector": {
"bucket_selector": {
"buckets_path": {
"count": "_count"
},
"script": {
"lang":"expression",
"inline": "count>5000 && count <10000"
}
}
}
}
}
}
```
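Applied to the original question, the same approach should work against the categories field. The following is an untested sketch (assuming a recent Elasticsearch version where Painless is the default script language; the sub-aggregation name count_filter and the terms size of 5000 are illustrative choices, not taken from the question) that keeps only buckets with a doc_count between 5000 and 25000:

```
{
  "size": 0,
  "aggs": {
    "categories_agg": {
      "terms": {
        "field": "categories",
        "size": 5000
      },
      "aggs": {
        "count_filter": {
          "bucket_selector": {
            "buckets_path": {
              "count": "_count"
            },
            "script": "params.count >= 5000 && params.count <= 25000"
          }
        }
      }
    }
  }
}
```

Note that bucket_selector only filters the buckets the parent terms aggregation has already returned, so the terms size needs to be large enough for all candidate buckets to be considered before the filter is applied.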
Answer 1 (score: 1)
If you only need to filter on a minimum doc_count, there is a simpler solution from Elasticsearch filter aggregations on minimal doc count. To save you the lookup:
aggs: {
  field1: {
    terms: {
      field: 'field1',
      min_doc_count: 1000
    },