Elasticsearch: counting occurrences of a value in nested documents

Date: 2019-06-20 14:08:50

Tags: elasticsearch

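Is there a way to count how many times a specific value of a nested type (for example, services.service: "map") occurs across all documents?

The documents look like this: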

{
  "services": [
    {
      "service": "map",
      "version": 2,
      "provider": "none"
    },
    {
      "service": "map",
      "version": 0,
      "provider": "none"
    },
    {
      "service": "language",
      "version": 2,
      "provider": "none"
    }
  ],
  "date": "2019-04-26T19:17:20.197Z"
}

With this mapping:

{
  "mappings": {
    "stat": {
      "properties": {
        "services": {
          "type": "nested",
          "properties": {
            "version": {
              "type": "keyword"
            },
            "provider": {
              "type": "keyword"
            },
            "service": {
              "type": "keyword"
            }
          }
        },
        "date": {
          "type": "date"
        }
      }
    }
  }
}

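I can return the count for each document (see the query below), but I would prefer to get the total number of occurrences summed across all of my documents.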

{
  "query": {
    "nested": {
      "path": "services",
      "query": {
        "match": {
          "services.service": "map"
        }
      },
      "inner_hits": {
        "_source": false,
        "docvalue_fields": ["services.service"]
      }
    }
  }
}

1 Answer:

Answer 0: (score: 0)

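I think the best approach is to combine a nested aggregation with a terms aggregation, since your fields are of type keyword. You can add the include property to the terms aggregation to get only the buckets with a specific value (or several values, when passed as an array). Because the terms aggregation runs inside the nested aggregation, each bucket's doc_count counts matching nested objects across all documents, not top-level documents.

The query for the case you posted would be: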

GET myindex/_search
{
  "size": 0,
  "aggs": {
    "services": {
      "nested": {
        "path": "services"
      },
      "aggs": {
        "service_count": {
          "terms": {
            "field": "services.service",
            "include": "map"
          }
        }
      }
    }
  }
}
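
As a rough illustration of how the result could be consumed, here is a minimal Python sketch. It assumes the aggregation names used above (services and service_count); the helper names build_count_request and read_counts are hypothetical, and the sample response is hand-written to mirror the usual shape of a terms aggregation result rather than taken from a real cluster.

```python
from typing import Any, Dict, List


def build_count_request(path: str, field: str, values: List[str]) -> Dict[str, Any]:
    """Build the nested + terms aggregation body, restricted to exact values.

    Hypothetical helper: it only assembles the same JSON body as the query above.
    """
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "aggs": {
            "services": {
                "nested": {"path": path},
                "aggs": {
                    "service_count": {
                        # "include" as an array keeps only these exact terms
                        "terms": {"field": field, "include": values}
                    }
                },
            }
        },
    }


def read_counts(response: Dict[str, Any]) -> Dict[str, int]:
    """Map each returned bucket key to its doc_count (number of nested objects)."""
    buckets = response["aggregations"]["services"]["service_count"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hand-written response for an index holding only the single sample document
# from the question (an assumption, not real cluster output).
sample_response = {
    "aggregations": {
        "services": {
            "doc_count": 3,
            "service_count": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [{"key": "map", "doc_count": 2}],
            },
        }
    }
}

body = build_count_request("services", "services.service", ["map"])
counts = read_counts(sample_response)
# A value with no occurrences gets no bucket, so default to 0.
print(counts.get("map", 0))
```

The body dictionary can be sent with whichever HTTP or Elasticsearch client you already use. Looking the value up with a default of 0 covers the case where no nested object matches and the bucket is absent.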

To learn more about nested aggregations, see the docs.