I wrote a two-level aggregation query:
{
  "size": 0,
  "aggregations": {
    "colors": {
      "terms": {
        "field": "color"
      },
      "aggregations": {
        "timestamps": {
          "date_histogram": {
            "field": "timestamp",
            "interval": "1m",
            "order": {
              "_key": "desc"
            }
          },
          "aggregations": {
            "timestamps_bucket_filter": {
              "bucket_selector": {
                "buckets_path": {
                  "counterts": "_count"
                },
                "script": {
                  "lang": "expression",
                  "script": "counterts == 0"
                }
              }
            }
          }
        }
      }
    }
  }
}
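As you can see, the bucket_selector keeps only the sub-buckets (timestamps) that contain zero documents. The problem is that, after this filtering, some of the higher-level buckets (colors) are left with no sub-buckets at all.
For example: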
.
.
.
"aggregations": {
  "colors": {
    "doc_count_error_upper_bound": 12144,
    "sum_other_doc_count": 14785757,
    "buckets": [
      .
      .
      .,
      {
        "key": "Yellow", // <<-- this is an empty bucket I would like to exclude
        "doc_count": 57223,
        "timestamps": {
          "buckets": [
            // <<-- empty
          ]
        }
      },
      .
      .
      .
How can I exclude every color bucket that is left without any timestamp sub-buckets?
Thanks in advance!
Answer 0: (score: 1)
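Coming back to this, I found a fairly recent issue and PR that add a _bucket_count path to the buckets_path option, so that an aggregation can filter parent buckets based on how many buckets another aggregation has. In other words, if _bucket_count is 0 for a parent bucket, that bucket should be removed.
Here is the GitHub issue: https://github.com/elastic/elasticsearch/issues/19553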
Answer 1: (score: 0)
According to this page, you can specify a minimum document count for the buckets to be included:
"date_histogram": {
  "field": "timestamp",
  "interval": "1m",
  "min_doc_count": 1,
  "order": {
    "_key": "desc"
  }
},
Answer 2: (score: 0)
As posted above, 5.x already includes this feature. With _bucket_count, you can use:
.......
"aggregations": {
  "colors": {
    "terms": {
      "field": "color"
    },
    "aggregations": {
      "timestamps": {
        .......
      },
      "count_colors": {
        "bucket_selector": {
          "buckets_path": {
            "count": "timestamps._bucket_count"
          },
          "script": {
            "source": "params.count > 0"
          }
        }
      }
    }
  }
}
.......
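Putting this together with the query from the question, a minimal sketch might look like the following. It rests on a few assumptions: the selector is declared inside the sub-aggregations of colors, next to timestamps, and reads timestamps._bucket_count; the name non_empty_colors and the variable bucketCount are made up for illustration; and the cluster runs a version that accepts both the "interval" parameter and the "source" script key (the question's "lang": "expression" script is rewritten here in the default scripting language with a params. prefix). Whether _bucket_count is evaluated after the inner timestamps_bucket_filter has removed its buckets has not been verified here, so treat this as a starting point and check it against your own data.

```json
{
  "size": 0,
  "aggregations": {
    "colors": {
      "terms": {
        "field": "color"
      },
      "aggregations": {
        "timestamps": {
          "date_histogram": {
            "field": "timestamp",
            "interval": "1m",
            "order": {
              "_key": "desc"
            }
          },
          "aggregations": {
            "timestamps_bucket_filter": {
              "bucket_selector": {
                "buckets_path": {
                  "counterts": "_count"
                },
                "script": {
                  "source": "params.counterts == 0"
                }
              }
            }
          }
        },
        "non_empty_colors": {
          "bucket_selector": {
            "buckets_path": {
              "bucketCount": "timestamps._bucket_count"
            },
            "script": {
              "source": "params.bucketCount > 0"
            }
          }
        }
      }
    }
  }
}
```

If the sketch behaves as intended, a color such as "Yellow" whose timestamps aggregation ends up with no buckets would be dropped from the colors buckets rather than returned with an empty list.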