Get all tags with counts from all documents in Elasticsearch

Time: 2017-01-28 16:10:30

Tags: elasticsearch

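My index mp_v1 contains the source fields id and tags. The "tags" field holds all of a document's tags as a single space-separated string.

Example: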

{
        "_index": "mp_v1",
        "_type": "mp",
        "_id": "5",
        "_score": 1,
        "_source": {
          "id": 5,
          "tags": "tag1 black blue"
        }
}

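How can I get from Elasticsearch every tag that occurs across all documents, together with its count? For example, if I have two documents, the first with the tags "tag1 black blue" and the second with the tags "blue square", it should return: blue:2, tag1:1, black:1, square:1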

2 Answers:

Answer 0: (score: 3)

I am running ES 5.1.2.

PUT testindex_50
{
    "settings": {
        "analysis": {
            "analyzer": {},
            "filter": {}
        }
    },
    "mappings": {
        "table1": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "whitespace",
                    "fielddata": true
                }
            }
        }
    }
}

POST testindex_50/table1
{
  "title" : "tag1 aggs1 blue"
}

POST testindex_50/table1
{
  "title" : "tag2 aggs2 blue"
}

POST testindex_50/table1/_search
{
  "size": 0,
  "aggs": {
    "tags_count": {
      "terms": {
        "field": "title",
        "size": 10
      }
    }
  }
}

Response:

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "tags_count": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "blue",
          "doc_count": 2
        },
        {
          "key": "aggs1",
          "doc_count": 1
        },
        {
          "key": "aggs2",
          "doc_count": 1
        },
        {
          "key": "tag1",
          "doc_count": 1
        },
        {
          "key": "tag2",
          "doc_count": 1
        }
      ]
    }
  }
}
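
To make the numbers in the response easier to read, here is a minimal Python sketch of what the whitespace analyzer plus the terms aggregation compute. It is an illustration only, not Elasticsearch code: the function name tag_doc_counts is hypothetical, str.split() stands in for the whitespace analyzer, and the tie-breaking by key is an assumption based on the bucket order shown above.

```python
from collections import Counter


def tag_doc_counts(docs):
    """Count, for each tag, the number of documents that contain it."""
    counts = Counter()
    for text in docs:
        # str.split() breaks on whitespace, standing in for the whitespace analyzer.
        # set() makes a tag count once per document, as doc_count does.
        counts.update(set(text.split()))
    # Highest count first; ties ordered by key (assumed from the response above).
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # The two documents from the question.
    for tag, n in tag_doc_counts(["tag1 black blue", "blue square"]):
        print(f"{tag}:{n}")
```

For the two documents in the question this gives blue a count of 2 and every other tag a count of 1, which is the result the question asks for.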

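Answer 1: (score: 0)

You can simply run a plain terms aggregation with fielddata enabled (the dirty way).

However, the recommended approach is to split the field into separate tag values and then run the aggregation.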
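
One way to read "split the field" is to store the tags as an array of keyword values before indexing, so that no fielddata is needed. The Python sketch below rests on that reading. The helper split_tags and the constants MAPPING and SEARCH_BODY are hypothetical names, the mapping assumes the ES 5.x layout with the type mp from the question, and nothing here talks to a cluster: it only builds the JSON bodies you would send.

```python
import json


def split_tags(source):
    """Return a copy of a document's _source with "tags" as a list of tags."""
    doc = dict(source)
    doc["tags"] = doc["tags"].split()  # "tag1 black blue" -> ["tag1", "black", "blue"]
    return doc


# Hypothetical mapping (ES 5.x layout): "keyword" fields are not analyzed and
# aggregate through doc values, so "fielddata": true is not required.
MAPPING = {
    "mappings": {
        "mp": {
            "properties": {
                "id": {"type": "integer"},
                "tags": {"type": "keyword"},
            }
        }
    }
}

# Terms aggregation over the keyword field; "size": 0 leaves out the hits.
SEARCH_BODY = {
    "size": 0,
    "aggs": {"tags_count": {"terms": {"field": "tags", "size": 10}}},
}

if __name__ == "__main__":
    print(json.dumps(split_tags({"id": 5, "tags": "tag1 black blue"})))
    print(json.dumps(SEARCH_BODY))
```

With this layout each array element is indexed as its own term, so the aggregation returns one bucket per tag without loading an analyzed text field into memory.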