ElasticSearch术语聚合

时间:2014-05-20 20:30:19

标签: elasticsearch elasticsearch-marvel

我尝试使用下面的查询对下面的数据进行弹性搜索来执行术语聚合,输出会将名称分解为标记(请参阅下面的输出)。所以我尝试将os_name映射为multi_field,现在我无法通过它进行查询。是否有可能没有令牌的索引?例如" Fedora Core"?

查询:

GET /temp/example/_search
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name"
       }
     }
  }
}

数据:

...
    {
        "_index": "temp",
        "_type": "example",
        "_id": "3",
        "_score": 1,
        "_source": {
           "title": "system3",
           "os_name": "Fedora Core",
           "os_version": 18
        }
     },
     {
        "_index": "temp",
        "_type": "example",
        "_id": "1",
        "_score": 1,
        "_source": {
           "title": "system1",
           "os_name": "Fedora Core",
           "os_version": 20
        }
     },
     {
        "_index": "temp",
        "_type": "example",
        "_id": "2",
        "_score": 1,
        "_source": {
           "title": "backup",
           "os_name": "Yellow Dog",
           "os_version": 6
        }
     }
...

输出:

       ...
        {
           "key": "core",
           "doc_count": 2
        },
        {
           "key": "fedora",
           "doc_count": 2
        },
        {
           "key": "dog",
           "doc_count": 1
        },
        {
           "key": "yellow",
           "doc_count": 1
        }
       ...

映射:

PUT /temp
{
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}

2 个答案:

答案 0 :(得分:7)

实际上你应该像这样改变你的映射

"os_name": {
  "type": "string",
  "fields": {
     "raw": {
        "type": "string",
        "index": "not_analyzed"
     }
  }
},

并且您的aggs应该更改为:

GET /temp/example/_search
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name.raw"
       }
     }
  }
}

答案 1 :(得分:4)

一个可行的解决方案是将字段设置为not_analyzed(在the docs for attribute "index"中详细了解它。)

此解决方案根本不会分析输入,具体取决于您可能希望设置custom analyzer的要求,例如不分割单词,但小写它们,以获得不区分大小写的结果。

curl -XDELETE localhost:9200/temp
curl -XPUT localhost:9200/temp -d '
{
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string",
          "index" : "not_analyzed"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}'

curl -XPUT localhost:9200/temp/example/1 -d '
{
    "title": "system3",
    "os_name": "Fedora Core",
    "os_version": 18
}'

curl -XPUT localhost:9200/temp/example/2 -d '
{
    "title": "system1",
    "os_name": "Fedora Core",
    "os_version": 20
}'

curl -XPUT localhost:9200/temp/example/3 -d '
{
    "title": "backup",
    "os_name": "Yellow Dog",
    "os_version": 6
}'

curl -XGET localhost:9200/temp/example/_search?pretty=true -d '
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name"
       }
     }
  }
}'

输出:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "OS" : {
      "buckets" : [ {
        "key" : "Fedora Core",
        "doc_count" : 2
      }, {
        "key" : "Yellow Dog",
        "doc_count" : 1
      } ]
    }
  }
}