Why is queryWeight included for some result scores, but not others, in the same query?

Time: 2014-01-19 06:32:24

Tags: elasticsearch lucene

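I'm running a query_string query with a single term across two fields, _all and tags.name, and trying to understand the scoring. Query: {"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}. Here are the documents the query returns:

  • Document 1 has an exact match on tags.name, but not on _all.
  • Document 8 has an exact match on both tags.name and _all.

Document 8 should win, and it does, but I'm confused about how the scoring works out. It seems that document 1 is penalized by having its tags.name score multiplied by the IDF twice, whereas document 8's tags.name score is multiplied by the IDF only once. In short:

  • They both have a component weight(tags.name:animal in 0) [PerFieldSimilarity]
  • In document 1, we have weight = score = queryWeight x fieldWeight
  • In document 8, we have weight = fieldWeight

Since queryWeight contains idf, document 1 ends up penalized by its idf twice.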
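
The arithmetic behind that claim can be checked directly. The sketch below is a minimal Python illustration, not Lucene code: it copies the idf, queryNorm, tf and fieldNorm values from the explain output in this post, and the function names are made up for the example.

```python
# Values copied from the _explain output for tags.name:animal.
IDF = 0.30685282        # idf(docFreq=1, maxDocs=1)
QUERY_NORM = 1.0        # queryNorm as reported for document 1
TF = 1.0                # tf(freq=1.0)


def field_weight(tf, idf, field_norm):
    """fieldWeight as shown in the explain output: tf x idf x fieldNorm."""
    return tf * idf * field_norm


def doc1_score():
    """Document 1: queryWeight x fieldWeight, so idf is applied twice."""
    query_weight = IDF * QUERY_NORM
    return query_weight * field_weight(TF, IDF, 0.625)


def doc8_score():
    """Document 8: fieldWeight only, so idf is applied once."""
    return field_weight(TF, IDF, 0.5)


print(doc1_score())  # compare with the reported 0.058849156
print(doc8_score())  # compare with the reported 0.15342641
```

Both results agree with the reported values to within float rounding, which confirms that the only structural difference between the two tags.name scores is the extra queryWeight factor.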

Can anyone make sense of this?

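Additional information

  • If I remove _all from the query's fields, queryWeight disappears from the explain output entirely.
  • Adding "use_dis_max":true as an option has no effect.
    • However, additionally adding "tie_breaker":0.7 (or any value) does affect document 8, giving it the more complicated formula seen in document 1.
    • Thoughts: it's plausible that a boolean query (which this is) does this on purpose, to give more weight to documents that match more than one subquery. However, that makes no sense for a dis_max query, which is supposed to return just the maximum of the subqueries.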
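
For reference, the rule I would expect a dis_max query to follow can be sketched as below. This assumes the usual disjunction-max combination (the best sub-score plus tie_breaker times the sum of the other sub-scores); the function name is hypothetical and the inputs are the per-field values from the explain output in this post. It illustrates only the combination step, since setting tie_breaker may also change the per-field values themselves, as noted above.

```python
def dis_max(scores, tie_breaker=0.0):
    """Best sub-score plus tie_breaker times the sum of the remaining ones."""
    best = max(scores)
    return best + tie_breaker * (sum(scores) - best)


# Per-field scores taken from the explain output (_all, tags.name).
doc1_scores = [0.058849156]               # only tags.name matched
doc8_scores = [0.033902764, 0.15342641]   # _all and tags.name matched

# With tie_breaker = 0 the result is just the maximum, matching "max of:".
print(dis_max(doc1_scores))
print(dis_max(doc8_scores))
```

With a single matching field, as in document 1, the tie_breaker term is zero, so the combination step alone cannot explain a different per-field formula.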

Here are the relevant explain requests. Look for the embedded comments.

Document 1 (match only on tags.name):

curl -XGET 'http://localhost:9200/questions/question/1/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "1",
  "matched" : true,
  "explanation" : {
    "value" : 0.058849156,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.058849156,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = score = queryWeight x fieldWeight
      "details" : [ {
        // score and queryWeight are NOT a part of the other explain!
        "value" : 0.058849156,
        "description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details" : [ {
          "value" : 0.30685282,
          "description" : "queryWeight, product of:",
          "details" : [ {
            // This idf is NOT a part of the other explain!
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 1.0,
            "description" : "queryNorm"
          } ]
        }, {
          "value" : 0.19178301,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "tf(freq=1.0), with freq of:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      } ]
    } ]
  }
}

Document 8 (match on both _all and tags.name):

curl -XGET 'http://localhost:9200/questions/question/8/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "8",
  "matched" : true,
  "explanation" : {
    "value" : 0.15342641,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.033902764,
      "description" : "btq, product of:",
      "details" : [ {
        "value" : 0.033902764,
        "description" : "weight(_all:anim in 0) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 0.033902764,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 0.70710677,
            "description" : "tf(freq=0.5), with freq of:",
            "details" : [ {
              "value" : 0.5,
              "description" : "phraseFreq=0.5"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.15625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      }, {
        "value" : 1.0,
        "description" : "allPayload(...)"
      } ]
    }, {
      "value" : 0.15342641,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = fieldWeight
      // No score or queryWeight in sight!
      "details" : [ {
        "value" : 0.15342641,
        "description" : "fieldWeight in 0, product of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "tf(freq=1.0), with freq of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "termFreq=1.0"
          } ]
        }, {
          "value" : 0.30685282,
          "description" : "idf(docFreq=1, maxDocs=1)"
        }, {
          "value" : 0.5,
          "description" : "fieldNorm(doc=0)"
        } ]
      } ]
    } ]
  }
}
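
To check that each node in these explanations really is the product or maximum of its children, a small tree walker helps. This is a rough sketch: the recompute function is hypothetical, it handles only the "product of", "max of" and "result of" patterns that appear in this post, and the sample dictionary is the tags.name branch of document 8 typed in by hand (the // comments above make the raw output invalid JSON).

```python
import math


def recompute(node):
    """Recompute an explain node's value from its children.

    Only the "product of", "max of" and "result of" patterns seen in this
    post are handled; any other node keeps its reported value.
    """
    details = node.get("details", [])
    if not details:
        return node["value"]
    description = node["description"]
    if "product of" in description:
        return math.prod(recompute(d) for d in details)
    if "max of" in description:
        return max(recompute(d) for d in details)
    if "result of" in description:
        return recompute(details[0])
    return node["value"]


# The tags.name branch of document 8, copied by hand from the output above.
doc8_tags_name = {
    "value": 0.15342641,
    "description": "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
    "details": [{
        "value": 0.15342641,
        "description": "fieldWeight in 0, product of:",
        "details": [
            {"value": 1.0, "description": "tf(freq=1.0), with freq of:",
             "details": [{"value": 1.0, "description": "termFreq=1.0"}]},
            {"value": 0.30685282, "description": "idf(docFreq=1, maxDocs=1)"},
            {"value": 0.5, "description": "fieldNorm(doc=0)"},
        ],
    }],
}

recomputed = recompute(doc8_tags_name)
print(recomputed, math.isclose(recomputed, doc8_tags_name["value"], rel_tol=1e-6))
```

Running the same function over the document 1 branch would show the extra queryWeight factor as one more child of the "product of" node.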

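1 answer:

Answer 0: (score: 0)

I haven't received an answer. I just want to mention that I also posted my question to the Elasticsearch forum: https://groups.google.com/forum/#!topic/elasticsearch/xBKlFkq0SP0 I'll report back here when I get an answer.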