Question

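As described here, a field in Elasticsearch that is defined with the "completion" type plus some analyzer and tokenizer is first split according to the underlying logic of those components and then "stitched" back together. I'm really not happy with this behavior.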

This is my current mapping setup (excerpt):

"mappings": {
    "movie": {
      "properties": {
        "title": {
          "analyzer": "standard",
          "fields": {
            "autocomplete": {
              "type": "completion",
              "analyzer": "whitespace"
            }
          },
          "type": "string"
        }
      }
    }
}

Let's take a movie titled Harry Potter as an example:

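When I type the prefix Har, I get the suggestion Harry Potter. When I type Pot, I get nothing, because after analysis/tokenization the individual tokens Harry and Potter are immediately stitched back together into Harry Potter.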

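Now, what I want is the following behavior: when I type Pot, I want the completion suggester to return Potter. Not Harry Potter, just Potter. Is this possible? Caveat: I don't even need a reference to the document the suggestion was created from. So it would be great if all generated tokens could be thrown into one pot and suggestions retrieved from there (since I have to do some other things anyway).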
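To make the difference concrete, here is a small Python sketch of the two behaviors. It is only a toy model of what is described above, not of Elasticsearch internals; the sample titles and both function names are made up for illustration.

```python
# Hypothetical sample data, not taken from a real index.
titles = ["Harry Potter", "Harry Brown"]


def suggest_whole_input(prefix, titles):
    """Toy model of the current behavior: the prefix must match
    the beginning of the whole (re-joined) title."""
    p = prefix.lower()
    return [t for t in titles if t.lower().startswith(p)]


def suggest_from_token_pot(prefix, titles):
    """Toy model of the desired behavior: every whitespace-separated
    token goes into one shared pot and is suggested on its own."""
    p = prefix.lower()
    pot = sorted({tok for t in titles for tok in t.split()})
    return [tok for tok in pot if tok.lower().startswith(p)]


print(suggest_whole_input("Har", titles))     # whole titles starting with "Har"
print(suggest_whole_input("Pot", titles))     # nothing matches
print(suggest_from_token_pot("Pot", titles))  # the single token "Potter"
```

The second function is what I am after: the suggestion is the token itself, with no link back to the document it came from.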

Answer 1

I'm doing something similar using the edge_ngram tokenizer. Here is the official documentation.

Your settings need to include the following:

{
  "settings" : {
    "index" : {
      "number_of_shards" : "5",
      "analysis" : {
        "analyzer" : {
          "autocomplete": {
            "type": "custom",
            "tokenizer": "autocomplete",
            "filter": [
                "lowercase"
            ]
          }
        },
        "tokenizer": {
          "autocomplete": {
            "type": "edge_ngram",
            "min_gram": 3,
            "max_gram": 20,
            "token_chars": [
              "letter",
              "digit"
            ]
          }
        }
      }
    }
  }
}
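To see which tokens these settings would put into the index, here is a rough pure-Python approximation of the analyzer above (edge n-grams of 3 to 20 characters per word, split on anything that is not a letter or digit, then lowercased). It is a sketch for illustration only and does not call Elasticsearch; the function name `edge_ngrams` is made up.

```python
import re


def edge_ngrams(text, min_gram=3, max_gram=20):
    """Approximate the 'autocomplete' analyzer: split the text into runs of
    letters/digits, emit the leading n-grams of each run, and lowercase them."""
    tokens = []
    # [^\W_]+ matches runs of letters and digits (word characters minus "_").
    for word in re.findall(r"[^\W_]+", text):
        longest = min(max_gram, len(word))
        for n in range(min_gram, longest + 1):
            tokens.append(word[:n].lower())
    return tokens


print(edge_ngrams("Harry Potter"))
# ['har', 'harr', 'harry', 'pot', 'pott', 'potte', 'potter']
```

Because "pot" ends up as its own indexed token, a search for Pot can match the title even though Potter is not the first word. Words shorter than min_gram produce no tokens at all.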

And your mapping will need to be updated so that the field uses "analyzer": "autocomplete".
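As a sketch of what that could look like, the snippet below builds the mapping and a matching query as plain Python dicts and prints them as JSON. The type name "movie", the subfield name "autocomplete" and the "string" field type are carried over from the question; putting the analyzer on a subfield rather than on title itself is my assumption, so adjust it to your own mapping.

```python
import json

# Assumed mapping: the "autocomplete" subfield is an ordinary string field
# analyzed with the custom "autocomplete" analyzer from the settings above.
mapping = {
    "movie": {
        "properties": {
            "title": {
                "type": "string",
                "analyzer": "standard",
                "fields": {
                    "autocomplete": {
                        "type": "string",
                        "analyzer": "autocomplete",
                    }
                },
            }
        }
    }
}

# A plain match query against the subfield; "pot" is the text typed so far.
query = {"query": {"match": {"title.autocomplete": "pot"}}}

print(json.dumps(mapping, indent=2))
print(json.dumps(query, indent=2))
```

Note that a match query returns the matching documents (here, the movie titled Harry Potter), not the bare token Potter, so this covers the "type Pot and find it" part but not the "suggest only Potter" part of the question.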

Elasticsearch completion suggester: is real tokenization possible?
