Question

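I am trying to run a terms aggregation in Elasticsearch on the data below, using the query below. The output breaks the names into tokens (see the output below). I then tried mapping os_name as a multi_field, but now I cannot query by it. Is it possible to index the field without tokenizing it, so that a value such as "Fedora Core" stays a single term?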

Query:

GET /temp/example/_search
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name"
       }
     }
  }
}

Data:

...
    {
        "_index": "temp",
        "_type": "example",
        "_id": "3",
        "_score": 1,
        "_source": {
           "title": "system3",
           "os_name": "Fedora Core",
           "os_version": 18
        }
     },
     {
        "_index": "temp",
        "_type": "example",
        "_id": "1",
        "_score": 1,
        "_source": {
           "title": "system1",
           "os_name": "Fedora Core",
           "os_version": 20
        }
     },
     {
        "_index": "temp",
        "_type": "example",
        "_id": "2",
        "_score": 1,
        "_source": {
           "title": "backup",
           "os_name": "Yellow Dog",
           "os_version": 6
        }
     }
...

Output:

       ...
        {
           "key": "core",
           "doc_count": 2
        },
        {
           "key": "fedora",
           "doc_count": 2
        },
        {
           "key": "dog",
           "doc_count": 1
        },
        {
           "key": "yellow",
           "doc_count": 1
        }
       ...

Mapping:

PUT /temp
{
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}

Answer 1

You should change your mapping for os_name like this, so that it gets a not_analyzed sub-field named raw:

"os_name": {
  "type": "string",
  "fields": {
     "raw": {
        "type": "string",
        "index": "not_analyzed"
     }
  }
},

Your aggregation should then target that sub-field:

GET /temp/example/_search
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name.raw"
       }
     }
  }
}
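
To connect this with the "cannot query by it" part of the question: with the multi-field mapping, full-text queries can keep targeting the analyzed os_name field, while the aggregation targets os_name.raw. The request below is a minimal sketch, assuming the temp index and example type from the question and the same pre-2.0 string / not_analyzed syntax used above. Documents indexed before the mapping change would need to be reindexed for os_name.raw to be populated.

```json
GET /temp/example/_search
{
  "size": 0,
  "query": {
    "match": {
      "os_name": "fedora"
    }
  },
  "aggs": {
    "OS": {
      "terms": {
        "field": "os_name.raw"
      }
    }
  }
}
```

The match query goes through the analyzed tokens, so it should find both "Fedora Core" documents, and because the aggregation runs over the matched documents, the OS buckets would be keyed by the whole, untokenized value.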

Answer 2

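A workable solution is to set the field to not_analyzed (see the docs for attribute "index" for details).

This solution does not analyze the input at all. Depending on your requirements, you may want to set up a custom analyzer instead, for example one that does not split the value into words but lowercases it, to get case-insensitive results.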

curl -XDELETE localhost:9200/temp
curl -XPUT localhost:9200/temp -d '
{
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string",
          "index" : "not_analyzed"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}'

curl -XPUT localhost:9200/temp/example/1 -d '
{
    "title": "system3",
    "os_name": "Fedora Core",
    "os_version": 18
}'

curl -XPUT localhost:9200/temp/example/2 -d '
{
    "title": "system1",
    "os_name": "Fedora Core",
    "os_version": 20
}'

curl -XPUT localhost:9200/temp/example/3 -d '
{
    "title": "backup",
    "os_name": "Yellow Dog",
    "os_version": 6
}'

curl -XGET 'localhost:9200/temp/example/_search?pretty=true' -d '
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name"
       }
     }
  }
}'

Output:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "OS" : {
      "buckets" : [ {
        "key" : "Fedora Core",
        "doc_count" : 2
      }, {
        "key" : "Yellow Dog",
        "doc_count" : 1
      } ]
    }
  }
}
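
The custom analyzer mentioned above could look roughly like the following: a keyword tokenizer keeps the whole value as a single token, and a lowercase filter makes the result case-insensitive. This is only a sketch under the same pre-2.0 mapping syntax as the rest of this answer; the analyzer name lowercase_keyword is made up for illustration and is not a built-in.

```json
PUT /temp
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string",
          "analyzer": "lowercase_keyword"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}
```

With a mapping like this, the same terms aggregation on os_name would be expected to return lowercased keys such as "fedora core" and "yellow dog", and values that differ only in case would fall into the same bucket.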

ElasticSearch terms aggregation
