Unique list of values from an array of strings

Asked: 2018-07-20 01:53:24

Tags: json elasticsearch elasticsearch-aggregation

I want a unique list of the values that appear in an array-of-strings field across all documents.

Example documents:

{
  "_index": "li",
  "_type": "profile",
  "_id": "tqvatGQBhAqGE7-_7pdF",
  "nonarrayfield":"person A",
  "attributes": [
      "blah blah 123",
      "112358",
      "quick brown fox"
    ]
},
{
  "_index": "li",
  "_type": "profile",
  "_id": "hqvatGQBhAqGE7-_7pRE",
  "nonarrayfield":"person B",
  "attributes": [
      "blah blah 123",
      "00000",
      "California"
    ]
}

What I want is the unique list of attributes:

  • "blah blah 123"
  • "112358"
  • "quick brown fox"
  • "00000"
  • "California"

When I try a basic aggregation query, I get "Error: 400 - all shards failed":

'{
   "aggs":{
    "aggregation_name":{
      "terms":{"field":"attributes"}
    }
   }
  }'

When I run the same aggregation on a non-array field, the query succeeds:

'{
   "aggs":{
    "aggregation_name":{
      "terms":{"field":"nonarrayfield"}
    }
   }
  }'

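1 Answer:

Answer 0 (score: 0)

Run the terms aggregation on the keyword sub-field of the array field. This assumes the index uses the default dynamic mapping, where a string field is indexed as text with a keyword sub-field. For example: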

{
  "size": 0,
  "aggs": {
    "aggregation_name": {
      "terms": {"field": "attributes.keyword"}
    }
  }
}

Your result will look similar to this:

{
    "took": 9,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "aggregation_name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "blah blah 123",
                    "doc_count": 2
                },
                {
                    "key": "00000",
                    "doc_count": 1
                },
                {
                    "key": "112358",
                    "doc_count": 1
                },
                {
                    "key": "California",
                    "doc_count": 1
                },
                {
                    "key": "quick brown fox",
                    "doc_count": 1
                }
            ]
        }
    }
}
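
The unique list itself is just the "key" of each bucket in that response. As a minimal sketch of how the request body could be built and the keys pulled out in Python: the helper names `build_unique_values_query` and `unique_values`, and the `max_buckets` default of 1000, are assumptions made for illustration and are not part of the original answer. The sketch sets the terms aggregation's `size` option explicitly because, without it, only the top 10 buckets are returned.

```python
from typing import Any, Dict, List


def build_unique_values_query(
    field: str,
    agg_name: str = "aggregation_name",
    max_buckets: int = 1000,  # assumed cap; pick a value that fits your data
) -> Dict[str, Any]:
    """Build a terms-aggregation request body for a keyword field."""
    return {
        "size": 0,  # return no hits, only the aggregation
        "aggs": {
            agg_name: {
                # The inner "size" limits how many buckets come back.
                "terms": {"field": field, "size": max_buckets}
            }
        },
    }


def unique_values(
    response: Dict[str, Any], agg_name: str = "aggregation_name"
) -> List[str]:
    """Collect the bucket keys from a terms-aggregation response."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [bucket["key"] for bucket in buckets]
```

The body returned by `build_unique_values_query("attributes.keyword")` can be sent to the index's `_search` endpoint with whichever client you use, and the parsed JSON response passed to `unique_values`. If the field holds more distinct values than `max_buckets`, the list will be incomplete.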