Question

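I'm trying to replicate a filtered aggregation, as in the following example. Here I'm trying to filter for documents that match a pipeline name and find the maximum duration of the executions within those pipelines.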

{
    "_source" : {"excludes": ["stderr"]},
    "aggs" : {
        "max_duration_filtered" : {
            "filter" : {
                "term": {
                    "pipeline": "{name_of_pipeline}"
                }
            },
            "aggs" : {
                "max_duration" : {
                    "max" : {
                        "field" : "duration"
                    }
                }
            }
        }
    }
}

Running this request returns the following output, along with 1 hit (I also passed size=1):

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 63643,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "{name-of-index}",
        "_type" : "raw_data",
        "_id" : "{an-id}",
        "_score" : 1.0,
        "_source" : {
          "duration" : 42.8,
          "pipeline" : "{a-different-pipeline}",
          "buildNumber" : {build-number-integer}
        }
      }
    ]
  },
  "aggregations" : {
    "max_duration_filtered" : {
      "doc_count" : 0,
      "max_duration" : {
        "value" : null
      }
    }
  }
}

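I'd really like to understand why the max duration value is null. As far as I can tell, I've mirrored what's in the documentation very closely. Is there anything I can try to fix this? Thanks!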

Answer 1

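The following query gives you the maximum duration per pipeline through a `terms` aggregation, so there is no need to filter by a specific pipeline. The `bool` filter in the `query` section is optional and only narrows the results to a single pipeline; drop it to get a bucket for every pipeline. Note that the query targets `pipeline.keyword` rather than `pipeline`: if `pipeline` is mapped as an analyzed `text` field, a `term` filter on the full pipeline name can match nothing, which would explain the `doc_count` of 0 and the `null` maximum in your output.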

{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "pipeline.keyword": "some-pipeline"
          }
        }
      ]
    }
  },
  "aggs": {
    "pipelines": {
      "terms": {
        "field": "pipeline.keyword",
        "size": 100
      },
      "aggs": {
        "max_duration": {
          "max": {
            "field": "duration"
          }
        }
      }
    }
  }
}
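
If you do want to keep the `filter` aggregation from the question, a minimal sketch of how the request body could be built and the result read is shown here in Python. It assumes that a `pipeline.keyword` sub-field exists (as it would under the default dynamic mapping for strings, which may not match your index), and the helper names `build_filtered_max_query` and `extract_max_duration` are hypothetical, introduced only for illustration. The body would still need to be sent to the index's `_search` endpoint with whatever client you use.

```python
import json


def build_filtered_max_query(pipeline_name, pipeline_field="pipeline.keyword"):
    """Build a search body: a filter aggregation wrapping a max aggregation.

    pipeline_field defaults to the keyword sub-field, which is an assumption
    about the index mapping; pass "pipeline" if that field is already keyword.
    """
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "aggs": {
            "max_duration_filtered": {
                "filter": {"term": {pipeline_field: pipeline_name}},
                "aggs": {"max_duration": {"max": {"field": "duration"}}},
            }
        },
    }


def extract_max_duration(response):
    """Return the max duration, or None when the filter matched no documents."""
    bucket = response["aggregations"]["max_duration_filtered"]
    if bucket["doc_count"] == 0:
        # Nothing matched the filter, so the max value is null as well.
        return None
    return bucket["max_duration"]["value"]


# Print the request body that would be sent to the _search endpoint.
print(json.dumps(build_filtered_max_query("some-pipeline"), indent=2))

# The aggregations part of the response shown in the question.
empty_response = {
    "aggregations": {
        "max_duration_filtered": {"doc_count": 0, "max_duration": {"value": None}}
    }
}
print(extract_max_duration(empty_response))
```

Checking `doc_count` first separates "the filter matched nothing" from "matching documents have no `duration` value", which is usually the first thing to rule out when the maximum comes back as `null`.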
