Question

I'm trying to understand how ranking works. My index is defined with the "english" analyzer on all fields.

Here is my query:

GET test_index_1/study/_search/
{
  "query": {
    "multi_match": {
      "query": "stupid question",
      "type": "best_fields",
      "fields": ["description", "title", "questions.text"]
    }
  }
}

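Below are the results it returns. I have only 3 documents in the test index.

I'd like to know why the first document's score is about twice that of the second.

Intuitively, the "title" and "description" fields are "equal": why does a match in "title" score higher?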

"hits": {
"total": 3,
"max_score": 1.7600523,
"hits": [
  {
    "_index": "test_index_1",
    "_type": "study",
    "_id": "AV28gnhD1DC3_uN8bTrd",
    "_score": 1.7600523,
    "_source": {
      "title": "stupid question",
      "description": "test test",
      "questions": [
        {
          "text": "stupid text"
        }
      ]
    }
  },
  {
    "_index": "test_index_1",
    "_type": "study",
    "_id": "AV28gomD1DC3_uN8bTre",
    "_score": 0.84339964,
    "_source": {
      "title": "test test",
      "description": "stupid question",
      "questions": [
        {
          "text": "stupid text"
        }
      ]
    }
  },
  {
    "_index": "test_index_1",
    "_type": "study",
    "_id": "AV28gpPT1DC3_uN8bTrf",
    "_score": 0.84339964,
    "_source": {
      "title": "test test",
      "description": "stupid question",
      "questions": [
        {
          "text": "no text"
        }
      ]
    }
  }
]
}

Thanks in advance for any hints.

Answer 1

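Elasticsearch is built on an inverted index and scores matches with a TF-IDF-style formula (BM25 is the default similarity since Elasticsearch 5.0). Under that model, a term counts for more the fewer documents it appears in, and this rarity is computed separately for each field. The terms "stupid" and "question" occur in only one title (the first result) but in two descriptions (the second and third results). A match on "stupid question" in title is therefore worth more than the same match in description, because the terms are rarer there. That is why the first document scores higher.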
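
To make the effect concrete, here is a minimal Python sketch of the inverse document frequency (IDF) for the two fields. It rests on assumptions not stated in the question: that the index uses the BM25 formula from Lucene's BM25Similarity, and that all three documents contribute to the same term statistics (for example, they sit on a single shard). The function name bm25_idf is purely illustrative. Because the matching field has the same length and term frequency in every document, the ratio of the scores should be close to the ratio of the IDF values.

```python
import math


def bm25_idf(doc_count, doc_freq):
    """IDF term of BM25 (assumed similarity): rarer terms get a larger weight."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Three documents in the index, as stated in the question.
doc_count = 3

# "stupid" and "question" each occur in 1 title and in 2 descriptions.
idf_title = bm25_idf(doc_count, doc_freq=1)
idf_description = bm25_idf(doc_count, doc_freq=2)

# Ratio predicted from IDF alone versus the ratio of the reported scores.
ratio = idf_title / idf_description
observed = 1.7600523 / 0.84339964

print(f"idf(title)       = {idf_title:.4f}")
print(f"idf(description) = {idf_description:.4f}")
print(f"predicted ratio  = {ratio:.4f}")
print(f"observed ratio   = {observed:.4f}")
```

Under these assumptions the predicted ratio comes out at roughly 2.09, in line with the scores shown in the question. Adding "explain": true to the search request body makes Elasticsearch return the exact per-field breakdown for each hit.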
