Question

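I have several Elasticsearch 1.3.2 indices, and I use custom document IDs. I want to find the number of distinct IDs across my indices. Some documents share the same ID but live in different indices, so this is not the same as simply counting documents. So I figured I would run a cardinality aggregation on the _id field, and I POST this to http://localhost:9200/*my_indices*/_search: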

{
  "from": 0,
  "size": 0,
  "aggregations": {
    "_count": {
      "cardinality": {
        "script": "doc['_id'].value",
        "lang": "groovy"
      }
    }
  }
}

But Elasticsearch just sends back this:

{
  "took": 60,
  "timed_out": false,
  "_shards": {
    "total": 175,
    "successful": 175,
    "failed": 0
  },
  "hits": {
    "total": 310714,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "_count": {
      "value": 0
    }
  }
}

I'm pretty sure there are more than 0 IDs in there! What's going on, and is it possible to get what I want?

Answer 1

There is another solution that doesn't require reindexing everything: use the _uid field instead:

{
  "from": 0,
  "size": 0,
  "aggregations": {
    "_count": {
      "cardinality": {
        "field": "_uid"
      }
    }
  }
}
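
As a rough illustration of this approach, the sketch below builds the same request body in Python and reads the count out of a response shaped like the one in the question. The helper names and the aggregation name `distinct_ids` are made up for this example, and the code only constructs and parses JSON, so no particular HTTP client is assumed. Keep in mind that `_uid` combines the type and the ID (`type#id`), so the result counts distinct type/ID pairs, and that the cardinality aggregation returns an approximate count.

```python
import json


def build_uid_cardinality_request(agg_name="distinct_ids"):
    """Build a search body that counts distinct _uid values.

    agg_name is a hypothetical label chosen for this sketch.
    """
    return {
        "from": 0,
        "size": 0,  # no hits needed, only the aggregation result
        "aggregations": {
            agg_name: {
                "cardinality": {"field": "_uid"}
            }
        },
    }


def extract_cardinality(response, agg_name="distinct_ids"):
    """Pull the (approximate) distinct count out of a search response."""
    return response["aggregations"][agg_name]["value"]


if __name__ == "__main__":
    # Body to POST to the _search endpoint of the indices in question.
    print(json.dumps(build_uid_cardinality_request(), indent=2))
```

The printed body can be sent with whatever client is already in use (curl, an HTTP library, and so on); `extract_cardinality` then works on the parsed JSON response.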

Answer 2

By default, the _id field is not indexed and not stored. And I think it isn't stored in _source either. So you cannot use the aggregation as is.

For your index, you need to change the mapping so that _id is indexed:

  "_id": {
    "index": "not_analyzed"
  }
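
To show where that fragment would sit, here is a minimal sketch that builds an index-creation body with `_id` indexed, plus the matching aggregation on `_id`. The type name `my_type`, the aggregation name `distinct_ids`, and the helper names are placeholders for illustration, not anything from the question. It assumes the 1.x mapping layout, where the `_id` settings live inside the type mapping; documents indexed before the mapping change would presumably still need to be reindexed.

```python
import json


def build_index_body(type_name="my_type"):
    """Index-creation body that makes _id indexed (not analyzed).

    type_name is a hypothetical mapping type used only for this sketch.
    """
    return {
        "mappings": {
            type_name: {
                "_id": {"index": "not_analyzed"}
            }
        }
    }


def build_id_cardinality_request(agg_name="distinct_ids"):
    """Search body that counts distinct _id values once _id is indexed."""
    return {
        "size": 0,  # only the aggregation result is of interest
        "aggregations": {
            agg_name: {
                "cardinality": {"field": "_id"}
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
    print(json.dumps(build_id_cardinality_request(), indent=2))
```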

Can I compute the cardinality of the _id field in Elasticsearch?
