Exclude empty sub-buckets in Elasticsearch

Asked: 2016-07-12 11:47:27

Tags: elasticsearch filter

I wrote a two-level aggregation query:

{
  "size": 0,
  "aggregations": {
    "colors": {
      "terms": {
        "field": "color"
      },
      "aggregations": {
        "timestamps": {
          "date_histogram": {
            "field": "timestamp",
            "interval": "1m",
            "order": {
              "_key": "desc"
            }
          },
          "aggregations": {
            "timestamps_bucket_filter": {
              "bucket_selector": {
                "buckets_path": {
                  "counterts": "_count"
                },
                "script": {
                  "lang": "expression",
                  "script": "counterts == 0"
                }
              }
            }
          }
        }
      }
    }
  }
}

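As you can see, I filter the sub-buckets (timestamps) and keep only those with zero documents. The problem is that, after this filtering, some of the top-level buckets (colors) are left empty.

For example: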

.
.
.
"aggregations": {
  "colors": {
    "doc_count_error_upper_bound": 12144,
    "sum_other_doc_count": 14785757,
    "buckets": [
      .
      .
      .,
      {
        "key": "Yellow",    // <<-- this is an empty bucket I would like to exclude
        "doc_count": 57223,
        "timestamps": {
          "buckets": [
                            // <<-- empty
          ]
        }
      },
      .
      .
      .

How can I exclude every color bucket that is left with no timestamp sub-buckets?

Thanks in advance!

3 answers:

Answer 0 (score: 1)

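Coming back to this, I found a fairly recent issue and PR that add a _bucket_count path to the buckets_path option, so that an aggregation can filter parent buckets based on the number of buckets another aggregation has. In other words, if a parent bucket's _bucket_count is 0, that bucket should be removed.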

Here is the GitHub issue: https://github.com/elastic/elasticsearch/issues/19553
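If the cluster runs a version without that option, one fallback is to drop the empty parent buckets client-side after the response arrives. The following is a minimal sketch in Python, assuming the response has already been parsed into a dict shaped like the example in the question; the function name `drop_empty_color_buckets` and the sample data are hypothetical.

```python
from typing import Any, Dict


def drop_empty_color_buckets(response: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the response without color buckets whose
    "timestamps" sub-aggregation has no buckets left."""
    colors = response["aggregations"]["colors"]
    kept = [
        bucket
        for bucket in colors["buckets"]
        if bucket.get("timestamps", {}).get("buckets")
    ]
    # Shallow copies, so the caller's response dict is left unchanged.
    return {
        **response,
        "aggregations": {
            **response["aggregations"],
            "colors": {**colors, "buckets": kept},
        },
    }


# Hypothetical sample data, shaped like the response in the question.
sample = {
    "aggregations": {
        "colors": {
            "buckets": [
                {"key": "Yellow", "doc_count": 57223,
                 "timestamps": {"buckets": []}},
                {"key": "Red", "doc_count": 10,
                 "timestamps": {"buckets": [
                     {"key": 1468324020000, "doc_count": 0}
                 ]}},
            ]
        }
    }
}

filtered = drop_empty_color_buckets(sample)
print([b["key"] for b in filtered["aggregations"]["colors"]["buckets"]])
```

This only trims the buckets that were returned; the terms aggregation still spends its size budget on colors that end up empty, which is why a server-side filter is preferable when it is available.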

Answer 1 (score: 0)

According to this page, you can specify a minimum document count for the buckets to be included:

    "date_histogram": {
      "field": "timestamp",
      "interval": "1m",
      "min_doc_count": 1,
      "order": {
        "_key": "desc"
      }
    },
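To show where that fragment sits in the full request, here is a sketch in Python that builds the request body as a dict. It assumes the field names from the question; the helper name `build_query` is hypothetical, and the bucket_selector from the question is left out to keep the example short.

```python
import json
from typing import Any, Dict


def build_query(min_doc_count: int = 1) -> Dict[str, Any]:
    """Build the two-level aggregation with min_doc_count on the histogram."""
    return {
        "size": 0,
        "aggregations": {
            "colors": {
                "terms": {"field": "color"},
                "aggregations": {
                    "timestamps": {
                        "date_histogram": {
                            "field": "timestamp",
                            "interval": "1m",
                            # Skip histogram buckets with fewer documents than this.
                            "min_doc_count": min_doc_count,
                            "order": {"_key": "desc"},
                        }
                    }
                },
            }
        },
    }


# Serialize the body so it can be sent with any HTTP client.
print(json.dumps(build_query(), indent=2))
```

The setting applies to the timestamps buckets only; it does not by itself remove a colors bucket.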

Answer 2 (score: 0)

As posted above, 5.x already includes this feature. With _bucket_count, you can use:

.......
"aggregations": {
    "colors": {
      "terms": {
        "field": "color"
      },
      "aggregations": {
        "timestamps": {
          "date_histogram": {
            "field": "timestamp",
            "interval": "1m"
          }
        },
        "count_colors": {
          "bucket_selector": {
            "buckets_path": {
              "count": "timestamps._bucket_count"
            },
            "script": {
              "source": "params.count > 0"
            }
          }
        }
      }
    }
}
.......