Question

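I'm looking into Elasticsearch to handle the search queries users make on my website.

Suppose I have a person document with a vehicles_owned field, which is a list of strings. For example: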

{
    "name":"james",
    "surname":"smith",
    "vehicles_owned":["car","bike","ship"]
}

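I want to query which people own a given vehicle. I understand that ES can be configured so that boat is treated as a synonym of ship, so that a query for boat returns the user james, who owns a ship.

What I don't understand is whether this happens automatically or whether I have to import a list of synonyms.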

Answer 1

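The idea is to create a custom analyzer for the vehicles_owned field that uses the synonym token filter.

So you first need to define the index like this. The synonym filter reads its entries from synonyms.txt, and the vehicles_owned field is indexed with the synonym analyzer: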

curl -XPUT localhost:9200/your_index -d '{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym": {
            "tokenizer": "whitespace",
            "filter": [
              "synonym"
            ]
          }
        },
        "filter": {
          "synonym": {
            "type": "synonym",
            "synonyms_path": "synonyms.txt"
          }
        }
      }
    }
  },
  "mappings": {
    "your_type": {
      "properties": {
        "name": {
          "type": "string"
        },
        "surname": {
          "type": "string"
        },
        "vehicles_owned": {
          "type": "string",
          "index_analyzer": "synonym"
        }
      }
    }
  }
}'
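
To check that the analyzer really expands synonyms, the analyze API can be called against the new index. The following is only a sketch: analyze_term is a hypothetical helper name, it assumes Elasticsearch is listening on localhost:9200, that your_index was created with synonyms.txt already in place, and that the query-string form of the analyze API from the same Elasticsearch generation as the mapping syntax above is available.

```shell
# Hypothetical helper (not part of Elasticsearch): prints the tokens that the
# "synonym" analyzer of your_index produces for the term given as $1.
# Assumes a node on localhost:9200 and the older query-string style of the
# analyze API, where the text to analyze is sent as the request body.
analyze_term() {
  curl -s -XGET 'localhost:9200/your_index/_analyze?analyzer=synonym&pretty' -d "$1"
}
```

Calling analyze_term ship should then list both ship and boat among the tokens if the synonym file was loaded.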

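The synonyms you want to handle go in the $ES_HOME/config/synonyms.txt file, in one of the supported formats. Since synonyms_path is resolved relative to the config directory and the file is read when the index is created, put it in place before running the command above. For example: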

boat, ship
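
As a minimal sketch, the file could be created from the shell as shown here. It assumes ES_HOME points at the Elasticsearch installation directory (falling back to the current directory if it is unset); the car, automobile line is an illustrative extra entry, not something from the question.

```shell
# Assumption: ES_HOME is the Elasticsearch install directory; fall back to the
# current directory so the sketch can be tried without an installation.
SYN_DIR="${ES_HOME:-.}/config"
mkdir -p "$SYN_DIR"

# Solr-style synonym file: one comma-separated group per line,
# lines starting with # are comments.
cat > "$SYN_DIR/synonyms.txt" <<'EOF'
# Equivalent synonyms: each term in a group matches the others.
boat, ship
# Illustrative extra group (not from the original question).
car, automobile
EOF
```

The same format also accepts explicit mappings of the form a, b => c, but equivalent groups like the ones above are enough for the boat/ship case.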

Next, you can index the document:

curl -XPUT localhost:9200/your_index/your_type/1 -d '{
    "name":"james",
    "surname":"smith",
    "vehicles_owned":["car","bike","ship"]
}'

Finally, searching for either ship or boat returns the document we just indexed:

curl -XGET 'localhost:9200/your_index/your_type/_search?q=vehicles_owned:boat'
curl -XGET 'localhost:9200/your_index/your_type/_search?q=vehicles_owned:ship'
