Question

I've run into a problem that makes me think I don't fully understand index-time versus search-time analysis in ElasticSearch 5.5.

Suppose I have a basic index of people with only a name and a state. For simplicity, I've set al => alabama as the only state synonym.

PUT people
{
  "mappings": {
    "person": {
      "properties": {
        "name": {
          "type": "text"
        },
        "state": {
          "type": "text",
          "analyzer": "us_state"
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "state_synonyms": {
          "type": "synonym",
          "synonyms": "al => alabama"
        }
      },
      "analyzer": {
        "us_state": {
          "filter": [
            "standard",
            "lowercase",
            "state_synonyms"
          ],
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    }
  }
}

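My understanding is that when I index a document, the state field data will be indexed in its expanded synonym form. This can be tested by running: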

GET people/_analyze
{
  "text": "al",
  "field": "state"
}

which returns

{
  "tokens": [
    {
      "token": "alabama",
      "start_offset": 0,
      "end_offset": 2,
      "type": "SYNONYM",
      "position": 0
    }
  ]
}

That looks good, so let's index a document:

POST people/person
{
  "name": "dave",
  "state": "al"
}

and run a search:

GET people/person/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "state": "al"
          }
        }
      ]
    }
  }
}

which returns nothing:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

I expected the al in my search to be run through the same us_state analyzer and match my document. However, the search does work if I change the query to:

"term": { "state": "alabama" }

Answer 1

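This happens because you used a term query, which does not analyze its input. The index holds the token alabama, not al, so looking up the literal term al finds nothing. Change it to a match query, which runs the query text through the field's analyzer, and everything will work: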

GET people/person/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "state": "al"
          }
        }
      ]
    }
  }
}
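
To make the difference concrete, here is a minimal Python sketch of a toy inverted index. It is only an illustration and not how Elasticsearch is implemented: the names us_state_analyze, term_query and match_query are hypothetical, and the analyzer is reduced to lowercasing, whitespace splitting and the single al => alabama replacement.

```python
# Toy model of index-time vs. search-time analysis.
# Assumption: this is an illustration only, not Elasticsearch's real code.

SYNONYMS = {"al": "alabama"}  # mirrors the "al => alabama" rule


def us_state_analyze(text):
    """Lowercase, split on whitespace, then apply the synonym replacement."""
    return [SYNONYMS.get(token, token) for token in text.lower().split()]


# Index time: the analyzer output, not the raw text, is what gets stored.
inverted_index = {}
for doc_id, state in {1: "al"}.items():
    for token in us_state_analyze(state):
        inverted_index.setdefault(token, set()).add(doc_id)


def term_query(value):
    """Look up the value exactly as given, with no analysis."""
    return inverted_index.get(value, set())


def match_query(text):
    """Analyze the query text first, then look up each resulting token."""
    hits = set()
    for token in us_state_analyze(text):
        hits |= inverted_index.get(token, set())
    return hits


print(term_query("al"))       # set()
print(term_query("alabama"))  # {1}
print(match_query("al"))      # {1}
```

The stored token is alabama, so only a lookup that either supplies alabama directly or analyzes al into alabama first can find the document.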
