Elasticsearch: filtering aggregation buckets

Time: 2017-07-29 05:29:01

Tags: elasticsearch elasticsearch-aggregation

For simplicity, suppose I have 3 documents in an Elasticsearch index:

{"id": 1, "tags": ["t1", "t2", "t3"]}, 
{"id": 2, "tags": ["t1", "t4", "t5"]}

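I need to aggregate on specific tags only, without getting buckets for the other tags that happen to appear in the matching documents: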

{
  "aggs": {
    "tags": {
      "terms": {"field": "tags"}
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {"tags": ["t1", "t2"]}
        }
      ]
    }
  }
}

# RESULT
{
    "aggregations": {
        "tags": {
            "buckets": [
                {"doc_count": 2, "key": "t1"},
                {"doc_count": 1, "key": "t2"},
                {"doc_count": 1, "key": "t3"},  # should be removed by filter
                {"doc_count": 1, "key": "t4"},  # should be removed by filter
                {"doc_count": 1, "key": "t5"},  # should be removed by filter
            ],
        }
    },
    "hits": {
        "hits": [],
        "max_score": 0.0,
        "total": 2
    },
}

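How can I filter (or possibly post-filter) this result?

With 3 documents in the index, this produces only 3 extra buckets (t3, t4, t5). In the real case, however, the index holds more than 200K documents, and the effect is much worse: I need to aggregate on 50 tags, but the result contains more than 1K tags.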

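1 answer:

Answer 0 (score: 1)

Assuming your Elasticsearch version supports it, I would use the "include" parameter of the terms aggregation. The query filter only selects which documents are aggregated; "include" restricts which terms get a bucket. Your query should look like this: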

POST /test/_search
{
  "aggs": {
    "tags": {
      "terms": {"field": "tags",  "include": ["t1", "t2"]}
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {"tags": ["t1", "t2"]}
        }
      ]
    }
  }
}
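
To make the difference between the query filter and "include" concrete, here is a minimal Python sketch. It is not part of the original answer: `build_search_body` and `simulate_terms_agg` are hypothetical helper names, and the simulation is a simplified local model of a terms aggregation (it ignores the aggregation's default bucket limit and ordering), not Elasticsearch itself. It uses the two sample documents shown in the question.

```python
from collections import Counter

# The two sample documents shown in the question.
DOCS = [
    {"id": 1, "tags": ["t1", "t2", "t3"]},
    {"id": 2, "tags": ["t1", "t4", "t5"]},
]


def build_search_body(wanted_tags):
    """Build the search body from the answer for an arbitrary tag list.

    Hypothetical helper: the same list is used for the document filter
    and for the "include" parameter of the terms aggregation.
    """
    tags = list(wanted_tags)
    return {
        "aggs": {"tags": {"terms": {"field": "tags", "include": tags}}},
        "query": {"bool": {"filter": [{"terms": {"tags": tags}}]}},
    }


def simulate_terms_agg(docs, wanted_tags, include=None):
    """Simplified local model of the search (assumption, not Elasticsearch).

    1. The query filter keeps documents that carry at least one wanted tag.
    2. The terms aggregation counts every tag of those documents.
    3. If "include" is given, only the listed tags keep their bucket.
    """
    wanted = set(wanted_tags)
    matched = [doc for doc in docs if wanted & set(doc["tags"])]
    counts = Counter(tag for doc in matched for tag in set(doc["tags"]))
    if include is not None:
        allowed = set(include)
        counts = Counter({t: c for t, c in counts.items() if t in allowed})
    return dict(counts)


if __name__ == "__main__":
    # Without "include": buckets for t3, t4 and t5 appear as well.
    print(sorted(simulate_terms_agg(DOCS, ["t1", "t2"]).items()))
    # With "include": only the requested tags are bucketed.
    print(sorted(simulate_terms_agg(DOCS, ["t1", "t2"], ["t1", "t2"]).items()))
```

The dictionary returned by `build_search_body(["t1", "t2"])` has the same shape as the request body above, so with 50 tags the list only needs to be written once.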
