Question

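We have the following preconditions: the documents indexed in ES have a tags field that is an array of strings, for example ['visa', 'credit card']. We want to search these documents on the tags field.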

Requirements:

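If a document has the tags ['visa', 'credit card'], we want to return it only when the user types 'visa' or 'credit card'. Partial inputs such as 'card' or 'credit' must not match, so compound terms have to match in full.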
We want fuzzy matching when searching the tags field, for both single and compound terms.
We want to use synonyms on the tags field.

So I implemented:

"tags_analyzer": {
  "filter": [
    "lowercase",
    "asciifolding",
    "synonyms_expand"
  ],
  "char_filter": [
    "quotes",
    "html_strip",
    "ampersand",
    "returns"
  ],
  "type": "custom",
  "tokenizer": "keyword"
},

"query_analyzer": {
  "filter": [
    "lowercase",
    "my_asciifolding",
    "shingle"
  ],
  "char_filter": [
    "quotes",
    "html_strip",
    "ampersand",
    "returns"
  ],
  "type": "custom",
  "tokenizer": "standard"
},

"synonyms_expand": {
  "ignore_case": "true",
  "expand": "true",
  "type": "synonym",
  "synonyms": [
    "visa, credit card",
    "maestro, debit card"
  ],
  "tokenizer": "keyword"
},

"shingle": {
  "max_shingle_size": "3",
  "min_shingle_size": "2",
  "output_unigrams": "true",
  "type": "shingle",
  "filler_token": ""
}

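tags_analyzer is used at index time and query_analyzer at query time. However, this solution does not work for fuzzy matching on compound terms. Does anyone know why, or have another solution?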
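
To make the mismatch between the two analyzers easier to see, here is a rough Python sketch of the tokens each side produces. It is only an approximation and not Elasticsearch code: the standard tokenizer is stood in for by a simple `\w+` split, the synonym, asciifolding and char filters are left out, and the helper names `index_tokens` and `query_tokens` are made up for this illustration. The real shingle filter also emits its tokens in position order rather than grouped by size.

```python
import re


def index_tokens(tag):
    """Approximate tags_analyzer: keyword tokenizer + lowercase.

    The whole tag stays a single token, spaces included.
    """
    return [tag.lower()]


def query_tokens(text, min_size=2, max_size=3):
    """Approximate query_analyzer: word split + lowercase + shingles.

    Unigrams are kept, mirroring "output_unigrams": "true".
    """
    words = re.findall(r"\w+", text.lower())
    tokens = list(words)
    for size in range(min_size, max_size + 1):
        for start in range(len(words) - size + 1):
            tokens.append(" ".join(words[start:start + size]))
    return tokens


# Tokens stored for a document tagged ['visa', 'credit card'].
indexed = {tok for tag in ["visa", "credit card"] for tok in index_tokens(tag)}

for user_input in ["credit card", "card", "credti card"]:
    produced = query_tokens(user_input)
    exact_hits = [tok for tok in produced if tok in indexed]
    print(user_input, "->", produced, "exact hits:", exact_hits)
```

Under these assumptions only the input 'credit card' yields an exact hit (through its two-word shingle). The bare 'card' yields none, which is the intended behaviour, and the misspelled 'credti card' yields none either, so it could only match if fuzzy expansion were applied to the shingle 'credti card' itself.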

ElasticSearch: matching a multi-word, keyword-tokenized field

0 answers: