Scoring with the Word Delimiter filter

Time: 2013-05-29 19:54:14

Tags: ruby-on-rails elasticsearch tire

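Even with "explain" turned on, I don't understand how my query is being scored. The scores seem arbitrary. Could someone with a better understanding of Elasticsearch explain this to me?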

My settings and mapping look like this (I'm using Tire):

  settings :analysis => {
            :analyzer => {
              :nickname => {
                :tokenizer => "standard",
                :filter    => ["stop", "nickname_words", "lowercase"],
                :type      => "custom"
              }
            },
            :filter => {
              :nickname_words  => { :type => "word_delimiter", :generate_word_parts => true, :generate_number_parts => true,
                                    :split_on_numerics => true, :split_on_case_change => true, :catenate_words => true,
                                    :preserve_original => true }
            }
          }

  mapping do
    indexes :id,       :type => "integer", :index => :not_analyzed, :include_in_all => false
    indexes :nickname, :type => "string",  :index_analyzer => :nickname, :search_analyzer => :standard, :include_in_all => true
  end

For example, if I run a match query for "test" against the nickname field, I get back:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_shard" : 2,
      "_node" : "YE2N_R_qRMaXmcdobaMkWQ",
      "_index" : "users",
      "_type" : "user",
      "_id" : "6",
      "_score" : 1.0, "_source" : {"id":6,"nickname":"test242424"},
      "_explanation" : {
        "value" : 1.0,
        "description" : "weight(nickname:test in 0) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "tf(freq=1.0), with freq of:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0"
            } ]
          }, {
            "value" : 1.0,
            "description" : "idf(docFreq=1, maxDocs=2)"
          }, {
            "value" : 1.0,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      }
    }, {
      "_shard" : 3,
      "_node" : "YE2N_R_qRMaXmcdobaMkWQ",
      "_index" : "users",
      "_type" : "user",
      "_id" : "7",
      "_score" : 1.0, "_source" : {"id":7,"nickname":"SecondTest353535"},
      "_explanation" : {
        "value" : 1.0,
        "description" : "weight(nickname:test in 0) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "tf(freq=1.0), with freq of:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0"
            } ]
          }, {
            "value" : 1.0,
            "description" : "idf(docFreq=1, maxDocs=2)"
          }, {
            "value" : 1.0,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      }
    }, {
      "_shard" : 0,
      "_node" : "YE2N_R_qRMaXmcdobaMkWQ",
      "_index" : "users",
      "_type" : "user",
      "_id" : "4",
      "_score" : 0.30685282, "_source" : {"id":4,"nickname":"test123"},
      "_explanation" : {
        "value" : 0.30685282,
        "description" : "weight(nickname:test in 0) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 0.30685282,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "tf(freq=1.0), with freq of:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 1.0,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      }
    } ]
  }
}

I expected test123 to rank ahead of SecondTest353535 and test242424, since it is the most similar to the query.

Why is this happening?
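A note on the numbers in the explain output: in all three hits tf and fieldNorm are 1.0, so the only factor that differs is idf. The idf values shown (1.0 for docFreq=1, maxDocs=2 and 0.30685282 for docFreq=1, maxDocs=1) are consistent with Lucene's classic formula idf = 1 + ln(maxDocs / (docFreq + 1)). Since maxDocs differs between shards, the statistics appear to be computed per shard rather than across the whole index. The Ruby sketch below only reproduces that arithmetic. It assumes the default TF-IDF similarity, and the helper names (idf, tf, field_weight) are illustrative, not part of Tire or Elasticsearch.

```ruby
# Sketch of the classic Lucene TF-IDF factors as they appear in the
# explain output. Helper names are hypothetical, chosen for illustration.

# idf = 1 + ln(maxDocs / (docFreq + 1))
def idf(doc_freq, max_docs)
  1.0 + Math.log(max_docs.to_f / (doc_freq + 1))
end

# tf = sqrt(termFreq)
def tf(freq)
  Math.sqrt(freq)
end

# fieldWeight is the product of tf, idf and fieldNorm.
def field_weight(freq, doc_freq, max_docs, field_norm)
  tf(freq) * idf(doc_freq, max_docs) * field_norm
end

# Shards 2 and 3 (test242424, SecondTest353535): docFreq=1, maxDocs=2
puts field_weight(1.0, 1, 2, 1.0).round(8)

# Shard 0 (test123): docFreq=1, maxDocs=1
puts field_weight(1.0, 1, 1, 1.0).round(8)
```

If that reading is right, test123 scores lower only because it is the sole document on its shard, not because of how the word_delimiter filter tokenized it.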

0 answers:

No answers