Question

How can I map one word to another in Elasticsearch? Suppose I have the following document:

{
"carName" : "Porche",
"review": " this car is so awesome"
}

Now, when I search for "good", "fantastic", etc., the search should map to "awesome". Is there a way to do this in Elasticsearch?

Answer 1

Yes, you can use the synonym token filter to achieve this.

First, define a new custom analyzer in the index settings and use that analyzer in the mapping.

curl -XPUT localhost:9200/cars -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "synonyms"
          ]
        }
      },
      "filter": {
        "synonyms": {
          "type": "synonym",
          "synonyms": [
            "good, awesome, fantastic"
          ]
        }
      }
    }
  },
  "mappings": {
    "car": {
      "properties": {
        "carName": {
          "type": "string"
        },
        "review": {
          "type": "string",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}'

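You can add as many synonyms as you like directly in the settings, or keep them in a separate file that the settings reference through the synonyms_path property.

Then we can index your sample document: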
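
As a rough sketch of the file-based variant: the snippet below writes a synonyms file and prints a filter definition that points at it. The directory variable ES_CONFIG_DIR and the file name analysis/synonyms.txt are assumptions made for illustration; synonyms_path is resolved relative to the Elasticsearch config directory, so adjust the location to your own node.

```shell
# Hypothetical location: point this at the config directory of your Elasticsearch node.
ES_CONFIG_DIR="${ES_CONFIG_DIR:-./es-config}"
mkdir -p "$ES_CONFIG_DIR/analysis"

# One rule per line; a comma-separated group makes the words equivalent.
cat > "$ES_CONFIG_DIR/analysis/synonyms.txt" <<'EOF'
# words treated as equivalent to "awesome"
good, awesome, fantastic
EOF

# Filter definition that would take the place of the inline "synonyms" list
# in the index settings (the path is relative to the config directory).
FILTER_JSON='{
  "synonyms": {
    "type": "synonym",
    "synonyms_path": "analysis/synonyms.txt"
  }
}'
printf '%s\n' "$FILTER_JSON"
```

The printed object is meant to replace the "filter" entry in the settings above; the analyzer definition stays the same.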

curl -XPUT localhost:9200/cars/car/1 -d '{
  "carName": "Porche",
  "review": " this car is so awesome"
}'

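When the synonyms token filter kicks in, it indexes the tokens good and fantastic alongside awesome, so you can search by any of those tokens and find the document. Concretely, analyzing the sentence this car is so awesome ...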

curl -XGET 'localhost:9200/cars/_analyze?analyzer=my_analyzer&pretty' -d 'this car is so awesome'

... produces the following tokens (see the last three tokens):

{
  "tokens" : [ {
    "token" : "this",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "car",
    "start_offset" : 5,
    "end_offset" : 8,
    "type" : "<ALPHANUM>",
    "position" : 2
  }, {
    "token" : "is",
    "start_offset" : 9,
    "end_offset" : 11,
    "type" : "<ALPHANUM>",
    "position" : 3
  }, {
    "token" : "so",
    "start_offset" : 12,
    "end_offset" : 14,
    "type" : "<ALPHANUM>",
    "position" : 4
  }, {
    "token" : "good",
    "start_offset" : 15,
    "end_offset" : 22,
    "type" : "SYNONYM",
    "position" : 5
  }, {
    "token" : "awesome",
    "start_offset" : 15,
    "end_offset" : 22,
    "type" : "SYNONYM",
    "position" : 5
  }, {
    "token" : "fantastic",
    "start_offset" : 15,
    "end_offset" : 22,
    "type" : "SYNONYM",
    "position" : 5
  } ]
}

Finally, you can search for and retrieve the document like this:

curl -XGET 'localhost:9200/cars/car/_search?q=review:good'
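
The same search can probably also be written as a request body with a match query, which is easier to extend than the q= form. This is only a sketch: the helper names review_query and search_reviews are made up for illustration, and it assumes the same index, type, and Elasticsearch version as the commands above. The Content-Type header is ignored by older versions and required by newer ones.

```shell
# Build a match query body for the review field.
# Naive on purpose: the search term is inserted as-is, with no JSON escaping.
review_query() {
  printf '{"query": {"match": {"review": "%s"}}}' "$1"
}

# Send the query to the cars index (hypothetical helper, needs a running node).
search_reviews() {
  curl -s -XGET 'localhost:9200/cars/car/_search?pretty' \
    -H 'Content-Type: application/json' \
    -d "$(review_query "$1")"
}
```

Calling search_reviews fantastic or search_reviews good should then return the Porche document, since the match query analyzes the search term with the analyzer configured on the review field.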
