Aggregate by array items

Date: 2017-08-31 19:46:08

Tags: elasticsearch

I'm having trouble with an Elasticsearch aggregation query. Sample data:

PUT test
POST test/customer
{
  "name": "John",
  "cities": ["NYC", "Paris"],
  "sId": 1
}
POST test/customer
{
  "name": "Steve",
  "cities": ["NYC"],
  "sId": 2
}
POST test/customer
{
  "name": "John",
  "cities": ["Paris", "Cape Town"],
  "sId": 3
}
GET test/customer/_search
{
  "query": {
    "match_all": {}
  }
}

What I want is the number of documents for each person and city combination, in this format:

{
    key: "John_Paris",
    doc_count: 2
},
{
    key: "John_NYC",
    doc_count: 1
},
{
    key: "John_Cape Town",
    doc_count: 1
},
{
    key: "Steve_NYC",
    doc_count: 1
}

I tried this, but the result is wrong:

POST test/_search
{
  "size": 0,
  "aggs": {
    "duplicateCount": {
      "terms": {
        "script": {
          "lang": "painless",
          "inline": "return [doc['name.keyword'].value, doc['cities.keyword'].value].join('_')"
        },
        "size": 2000000000,
        "min_doc_count": 1
      }
    }
  }
}

It returns:

{
  "key": "John_Cape Town",
  "doc_count": 1
},
{
  "key": "John_NYC",
  "doc_count": 1
},
{
  "key": "Steve_NYC",
  "doc_count": 1
}

It does not include the John_Paris entry:

{
    key: "John_Paris",
    doc_count: 2
}

How can I achieve this?

Thanks in advance.

Update

So, the answer that let me keep the expected result format was to build an array of keys inside the inline script and return it, like this:

POST test/_search
{
  "size": 0,
  "aggs": {
    "duplicateCount": {
      "terms": {
        "script": {
          "lang": "painless",
          "inline": "def keys = []; for (p in doc['cities.keyword'].values) { keys.add(doc['name.keyword'].value + '_'  + p);} return keys;"
        },
        "size": 2000000000,
        "min_doc_count": 1
      }
    }
  }
}
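
To see why the first attempt drops John_Paris while the array version finds it, here is a rough Python simulation of the two scripts over the three sample documents. It is only a sketch: the helper names (`first_value_keys`, `all_value_keys`, `terms_agg`) are made up for illustration, and it assumes that keyword doc values are exposed in sorted order, so `.value` sees only the smallest city of each document, and that a terms aggregation counts each distinct key once per document.

```python
from collections import Counter

# Sample documents from the question.
docs = [
    {"name": "John", "cities": ["NYC", "Paris"], "sId": 1},
    {"name": "Steve", "cities": ["NYC"], "sId": 2},
    {"name": "John", "cities": ["Paris", "Cape Town"], "sId": 3},
]


def first_value_keys(doc):
    # Mimics doc['cities.keyword'].value in the first script.
    # Assumption: doc values are sorted, so only the smallest city is used.
    return [doc["name"] + "_" + sorted(doc["cities"])[0]]


def all_value_keys(doc):
    # Mimics the loop over doc['cities.keyword'].values in the update:
    # one key per city of the document.
    return [doc["name"] + "_" + city for city in sorted(set(doc["cities"]))]


def terms_agg(documents, key_fn):
    # Assumption: each distinct key is counted once per document,
    # like doc_count in a terms aggregation.
    counts = Counter()
    for doc in documents:
        counts.update(set(key_fn(doc)))
    return dict(counts)


first_only = terms_agg(docs, first_value_keys)
all_values = terms_agg(docs, all_value_keys)

print(first_only)
print(all_values)
```

Under those assumptions, the single-value variant never produces a key for Paris, because Paris is never the first sorted city in any document, which matches the three buckets shown above.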

1 answer:

Answer 0 (score: 2):

You can nest aggregations. The results come back in a different format, but they carry similar information. Try this:

{
   "aggs":{
      "name":{
         "terms":{
            "field":"name.keyword"
         },
         "aggs":{
            "cities":{
               "terms":{
                  "field":"cities.keyword"
               }
            }
         }
      }
   }
}
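
If the flat name_city format is still needed, the nested buckets can be flattened on the client side. Below is a minimal Python sketch, assuming the response follows the usual `aggregations.<agg name>.buckets` layout with `key` and `doc_count` on each bucket. The function name `flatten_buckets` is hypothetical, and `sample_response` is a hand-written stand-in built from the three sample documents, not real server output.

```python
def flatten_buckets(response):
    """Turn name -> cities nested buckets into flat "name_city" entries."""
    flat = []
    for name_bucket in response["aggregations"]["name"]["buckets"]:
        for city_bucket in name_bucket["cities"]["buckets"]:
            flat.append({
                "key": name_bucket["key"] + "_" + city_bucket["key"],
                "doc_count": city_bucket["doc_count"],
            })
    return flat


# Hand-written stand-in for the search response (an assumption, not real output).
sample_response = {
    "aggregations": {
        "name": {
            "buckets": [
                {
                    "key": "John",
                    "doc_count": 2,
                    "cities": {
                        "buckets": [
                            {"key": "Paris", "doc_count": 2},
                            {"key": "Cape Town", "doc_count": 1},
                            {"key": "NYC", "doc_count": 1},
                        ]
                    },
                },
                {
                    "key": "Steve",
                    "doc_count": 1,
                    "cities": {"buckets": [{"key": "NYC", "doc_count": 1}]},
                },
            ]
        }
    }
}

flat = flatten_buckets(sample_response)
for entry in flat:
    print(entry)
```

Keep in mind that a terms aggregation returns only the top buckets by default, so a larger `size` on both levels may be needed to see every combination.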