Question

I am using this mapping:

  settings index: { number_of_shards: 1, number_of_replicas: 1 }, analysis: {
    analyzer: {
      custom_analyzer: {
        type: "custom",
        tokenizer: "standard",
        filter: ["lowercase", "asciifolding", "custom_unique_token", "custom_tokenizer"]
      }
    },
    filter: {
      custom_word_delimiter: {
        type: "word_delimiter",
        preserve_original: "true"
      },
      custom_unique_token: {
        type: "unique",
        only_on_same_position: "false"
      },      
      custom_tokenizer: {
        type: "nGram",
        min_gram: "3",
        max_gram: "10",
        token_chars: [ "letter", "digit" ]
      }
    }
  } do
    mappings dynamic: 'false' do
      indexes :searchable, analyzer: "custom_analyzer"
      indexes :year
    end
  end

And this query (Rails app):

search(query: {match: {searchable: {query: params[:text_search], minimum_should_match: "80%"}}}, size: 100)

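My main problem is that the app always returns 100 documents (the maximum requested). Of those 100 documents, only the first 10 or 15 are relevant. The rest are too far from the search term.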

I tried:
- increasing max_ngram from 3 to 10
- raising minimum_should_match up to 99%
- ...

But I always get 100 results.

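I really don't understand why, for example, when I search for "Boucab" I get 15 good results first, but I also get "Maucaillou" at position 99. How can I filter out these low-relevance matches?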
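
To make the "Boucab" / "Maucaillou" case concrete, here is a rough plain-Ruby sketch of character n-gram overlap. It only approximates the lowercase and nGram steps from the mapping (min_gram 3, max_gram 10) and is not Elasticsearch's actual analysis chain. The helper names `ngrams` and `shared_ngrams` are made up for illustration.

```ruby
# Approximate the n-gram step: lowercase the term and emit every
# substring whose length is between min_gram and max_gram.
# Plain Ruby only; this does not call Elasticsearch.
def ngrams(term, min_gram: 3, max_gram: 10)
  text = term.downcase
  (min_gram..max_gram).flat_map do |n|
    # For n longer than the term the range is empty, so nothing is emitted.
    (0..text.length - n).map { |i| text[i, n] }
  end.uniq
end

# N-grams that the two terms have in common.
def shared_ngrams(a, b)
  ngrams(a) & ngrams(b)
end

query_grams = ngrams("Boucab")
common = shared_ngrams("Boucab", "Maucaillou")

puts "n-grams in query: #{query_grams.size}"
puts "shared with Maucaillou: #{common.inspect}"
```

Under this approximation the two terms share a single trigram out of the n-grams produced for the query, which seems to be enough for "Maucaillou" to show up as a hit.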

My app is multilingual.

How can I avoid showing results with a poor score? Do I need to use the min_score parameter? Is that the only solution?
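
For reference, this is roughly what I mean by using min_score: a minimal sketch that builds the same match query as a plain Hash with a top-level `min_score` added. The helper name `build_search_body` is hypothetical, and the threshold value is an arbitrary placeholder, not a recommendation.

```ruby
# Hypothetical helper: builds the search body as a plain Hash.
# The min_score default is an arbitrary placeholder, not a tuned threshold.
def build_search_body(text, min_score: 1.0, size: 100)
  {
    query: {
      match: {
        searchable: { query: text, minimum_should_match: "80%" }
      }
    },
    min_score: min_score, # hits scoring below this value are left out
    size: size
  }
end

body = build_search_body("Boucab", min_score: 2.5)
puts body.inspect
```

The resulting Hash would be passed to the same `search` call as in my query. Since scores depend on the query, I am not sure a fixed threshold would work across different search terms.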

ElasticSearch: How to reduce the number of results when using ngrams and a match query?

0 Answers: