Elasticsearch aggregation splits hyphenated values into separate values

Time: 2015-01-12 10:35:09

Tags: elasticsearch

I am trying to retrieve an aggregation of tags (with counts) from Elasticsearch, but hyphenated tags are split apart and returned as separate tags.

E.g., given a document like:

{
    "tags": ["foo", "foo-bar", "cheese"]
}

I get back (abridged):

{
  "foo": 8,
  "bar": 3,
  "cheese": 2
}

when I was expecting to get:

{
  "foo": 5,
  "foo-bar": 3,
  "cheese": 2
}

My mapping is:

{
    "asset" : {
        "properties" : {
            "name" : {"type" : "string"},
            "path" : {"type" : "string", "index" : "not_analyzed"},
            "url": {"type" : "string"},
            "tags" : {"type" : "string", "index_name" : "tag"},
            "created": {"type" : "date"},
            "updated": {"type" : "date"},
            "usages": {"type" : "string", "index_name" : "usage"},
            "meta": {"type": "object"}
        }
    }
}

Can anyone point me in the right direction?

1 Answer:

Answer 0 (score: 1)

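Try a different analyzer for the tags field. The standard analyzer, which is the default, splits text into separate tokens at certain characters such as the hyphen, so "foo-bar" is indexed as "foo" and "bar". An analyzer built on the keyword tokenizer keeps each value as a single token: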

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_keyword_lowercase": {
          "tokenizer": "keyword",
          "filter": [
            "lowercase",
            "trim"
          ]
        }
      }
    }
  },
  "mappings": {
    "asset" : {
        "properties" : {
            "name" : {"type" : "string"},
            "path" : {"type" : "string", "index" : "not_analyzed"},
            "url": {"type" : "string"},
            "tags" : {"type" : "string", "index_name" : "tag", "analyzer":"my_keyword_lowercase"},
            "created": {"type" : "date"},
            "updated": {"type" : "date"},
            "usages": {"type" : "string", "index_name" : "usage"},
            "meta": {"type": "object"}
        }
    }
  }
}
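
To see why the counts change, here is a minimal Python sketch that imitates the two analyzers without talking to Elasticsearch at all. Everything in it is an assumption made for illustration: the helper names (`standard_like_tokens`, `keyword_lowercase_tokens`, `terms_counts`) are hypothetical, the regex split is only a rough stand-in for the standard analyzer, and the sample documents are made up so that the totals match the numbers in the question.

```python
import re
from collections import Counter


def standard_like_tokens(value):
    # Rough stand-in for the standard analyzer (assumption): lowercase,
    # then split on any run of characters that is not a letter or digit.
    return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]


def keyword_lowercase_tokens(value):
    # Imitates a keyword tokenizer followed by lowercase and trim filters:
    # the whole value stays as one token.
    token = value.lower().strip()
    return [token] if token else []


def terms_counts(docs, analyze):
    # Count, for each term, how many documents contain it at least once,
    # which is how a terms aggregation reports its document counts.
    counts = Counter()
    for doc in docs:
        terms = set()
        for tag in doc["tags"]:
            terms.update(analyze(tag))
        counts.update(terms)
    return dict(counts)


# Made-up sample data: 5 documents tagged "foo", 3 tagged "foo-bar",
# 2 tagged "cheese".
DOCS = (
    [{"tags": ["foo"]}] * 5
    + [{"tags": ["foo-bar"]}] * 3
    + [{"tags": ["cheese"]}] * 2
)

split_counts = terms_counts(DOCS, standard_like_tokens)
whole_counts = terms_counts(DOCS, keyword_lowercase_tokens)

print(split_counts)  # "foo-bar" contributes to both "foo" and "bar"
print(whole_counts)  # "foo-bar" is counted as its own term
```

With the splitting analyzer, the three "foo-bar" documents are added to the "foo" count and also produce a "bar" bucket, which is the result described in the question. With the keyword-style analyzer each tag is counted whole.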