I have an Elasticsearch index containing products, and I am trying to build a product list search over a text field.
Sample of the dataset:
{"name": "foo", "count": 10}
{"name": "bar", "count": 5}
{"name": "foo bar"}
{"name": "foo baz", "count": 20}
At first, I was using this request:
GET /product/_search
{
  "query": {
    "match": {"name": "foo"}
  }
}
This works well, but now I want to add a weight to certain products (the count field).
I am using this query:
GET /product/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {"name": "foo bar"}
      },
      "field_value_factor": {
        "field": "count",
        "missing": 0
      }
    }
  }
}
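But with this query I get foo first, then bar, then foo bar. It seems that matching on the name matters less than count. I would like to get foo bar, then foo and bar.
But when searching for foo, I would like foo baz, foo and foo bar.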
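Answer 0 (score: 1)
Regarding "But when searching for foo, I would like foo baz, foo and foo bar":
Adding a working example with index data, search query and search result.
Refer to the function score query documentation for a detailed explanation.
Index data: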
{"name": "foo", "count": 10}
{"name": "bar", "count": 5}
{"name": "foo bar"}
{"name": "foo baz", "count": 20}
Search query:
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "name": {
                  "query": "foo"
                }
              }
            }
          ]
        }
      },
      "functions": [
        {
          "field_value_factor": {
            "field": "count",
            "factor": 1.0,
            "missing": 0
          }
        }
      ],
      "boost_mode": "multiply"
    }
  }
}
Search result:
"hits": [
  {
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "4",
    "_score": 6.2774796,
    "_source": {
      "name": "foo baz",
      "count": 20
    }
  },
  {
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "1",
    "_score": 4.1299205,
    "_source": {
      "name": "foo",
      "count": 10
    }
  },
  {
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "3",
    "_score": 0.0,
    "_source": {
      "name": "foo bar"
    }
  }
]
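To see what boost_mode "multiply" did to these scores, here is a small Python sketch. It assumes that, with factor 1.0 and no modifier, the function value is simply the count field, so the final score is the match score times count. The helper name implied_match_score is made up for illustration; the numbers are copied from the search result above.

```python
# Scores and count values copied from the search result above.
reported = {
    "foo baz": {"score": 6.2774796, "count": 20},
    "foo": {"score": 4.1299205, "count": 10},
}


def implied_match_score(score: float, count: float) -> float:
    """Undo boost_mode "multiply", assuming final = match_score * count."""
    return score / count


implied = {
    name: implied_match_score(doc["score"], doc["count"])
    for name, doc in reported.items()
}

for name, value in implied.items():
    print(f"{name}: implied match score {value:.5f}")
```

Under that assumption, foo alone matches the text slightly better than foo baz, and it is the larger count that lifts foo baz to the top. foo bar has no count, so missing 0 multiplies its match score down to 0.0.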
Update 1:
Regarding "I would like to get foo bar, then foo and bar":
Search query:
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "name": {
                  "query": "foo bar"
                }
              }
            }
          ]
        }
      },
      "functions": [
        {
          "field_value_factor": {
            "field": "count",
            "factor": 1.0,
            "missing": 9,
            "modifier": "sqrt"
          }
        }
      ],
      "boost_mode": "sum"
    }
  }
}
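Explain API result:
To understand the search query above, you need to understand how the score is calculated for the query.
When searching for "name": "foo bar", the results should ideally be foo bar first, followed by foo and bar. A plain match query for foo bar (without a function score query) already gives you that result. Since you also want to add weight based on the count field, the query is wrapped in function_score, which lets you modify the score of the documents retrieved by a query. The field_value_factor function takes these parameters:
factor - optional factor to multiply the field value with, defaults to 1
modifier - modifier to apply to the field value
missing - value used if the document does not have that field
This translates into the following scoring formula:
sqrt(1.0 * doc['count'].value)
The document foo bar has no count field, so the missing value is used (defined in the query as 9). Its function score is therefore sqrt(1.0 * 9) = 3.0.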
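The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Elasticsearch code: the function name and its defaults are assumptions chosen to mirror the parameters described above, and only the "none" and "sqrt" modifiers are covered.

```python
import math
from typing import Optional


def field_value_factor(
    value: Optional[float],
    factor: float = 1.0,
    missing: float = 9.0,
    modifier: str = "sqrt",
) -> float:
    """Mimic modifier(factor * field_value), falling back to `missing`."""
    raw = factor * (missing if value is None else value)
    if modifier == "none":
        return raw
    if modifier == "sqrt":
        return math.sqrt(raw)
    raise ValueError(f"modifier not covered by this sketch: {modifier}")


print(field_value_factor(None))  # foo bar, no count field
print(field_value_factor(10))    # foo
print(field_value_factor(5))     # bar
```

The three values line up with the "field value function" entries in the explain output: 3.0, 3.1622777 and 2.236068.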
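If you set missing to a smaller value, the order of the results can change, because the score contributed by the count field changes. For example, with missing set to 0, foo bar gets its score only from the match query, and nothing is added by field_value_factor. The final score is the match query score plus the field_value_factor score (on the count field), so the total score of foo bar would then be lower than that of the other documents.
For example, the final score of foo bar is calculated as 0.78038335 + 3.0 = 3.7803833. Go through the result below for the details of how each score is calculated.
See this blog post to understand how scoring works in Elasticsearch.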
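As a rough check of how the missing value affects the ranking, the sketch below adds sqrt(count) to each match score, which is what boost_mode "sum" does here. The match scores are taken from the explain output that follows; the helper names final_score and ranking are made up for this example.

```python
import math

# Match-query scores taken from the explain output for the "foo bar" query.
docs = [
    {"name": "foo bar", "match": 0.78038335, "count": None},
    {"name": "foo", "match": 0.52354836, "count": 10},
    {"name": "bar", "match": 0.52354836, "count": 5},
]


def final_score(doc: dict, missing: float) -> float:
    """boost_mode "sum": match score + sqrt(1.0 * count or missing)."""
    count = missing if doc["count"] is None else doc["count"]
    return doc["match"] + math.sqrt(1.0 * count)


def ranking(missing: float) -> list:
    """Document names ordered by descending final score."""
    ordered = sorted(docs, key=lambda d: final_score(d, missing), reverse=True)
    return [d["name"] for d in ordered]


print(ranking(9))  # missing as used in the explain output
print(ranking(0))  # missing set to 0
```

With missing 9, foo bar stays ahead of foo by a small margin (about 3.78 versus 3.69). With missing 0 it drops to last place, which is the ordering the question complained about.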
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 3.7803833,
"hits": [
{
"_shard": "[stof_64169215][0]",
"_node": "fVeabsK0Q1GnCZ_8oROXjA",
"_index": "stof_64169215",
"_type": "_doc",
"_id": "3",
"_score": 3.7803833,
"_source": {
"name": "foo bar"
},
"_explanation": {
"value": 3.7803833,
"description": "sum of",
"details": [
{
"value": 0.78038335,
"description": "sum of:",
"details": [
{
"value": 0.39019167,
"description": "weight(name:foo in 0) [PerFieldSimilarity], result of:",
"details": [
{
"value": 0.39019167,
"description": "score(freq=1.0), computed as boost * idf * tf from:",
"details": [
{
"value": 2.2,
"description": "boost",
"details": []
},
{
"value": 0.47000363,
"description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details": [
{
"value": 2,
"description": "n, number of documents containing term",
"details": []
},
{
"value": 3,
"description": "N, total number of documents with field",
"details": []
}
]
},
{
"value": 0.37735844,
"description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details": [
{
"value": 1.0,
"description": "freq, occurrences of term within document",
"details": []
},
{
"value": 1.2,
"description": "k1, term saturation parameter",
"details": []
},
{
"value": 0.75,
"description": "b, length normalization parameter",
"details": []
},
{
"value": 2.0,
"description": "dl, length of field",
"details": []
},
{
"value": 1.3333334,
"description": "avgdl, average length of field",
"details": []
}
]
}
]
}
]
},
{
"value": 0.39019167,
"description": "weight(name:bar in 0) [PerFieldSimilarity], result of:",
"details": [
{
"value": 0.39019167,
"description": "score(freq=1.0), computed as boost * idf * tf from:",
"details": [
{
"value": 2.2,
"description": "boost",
"details": []
},
{
"value": 0.47000363,
"description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details": [
{
"value": 2,
"description": "n, number of documents containing term",
"details": []
},
{
"value": 3,
"description": "N, total number of documents with field",
"details": []
}
]
},
{
"value": 0.37735844,
"description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details": [
{
"value": 1.0,
"description": "freq, occurrences of term within document",
"details": []
},
{
"value": 1.2,
"description": "k1, term saturation parameter",
"details": []
},
{
"value": 0.75,
"description": "b, length normalization parameter",
"details": []
},
{
"value": 2.0,
"description": "dl, length of field",
"details": []
},
{
"value": 1.3333334,
"description": "avgdl, average length of field",
"details": []
}
]
}
]
}
]
}
]
},
{
"value": 3.0,
"description": "min of:",
"details": [
{
"value": 3.0,
"description": "field value function: sqrt(doc['count'].value?:9.0 * factor=1.0)",
"details": []
},
{
"value": 3.4028235E38,
"description": "maxBoost",
"details": []
}
]
}
]
}
},
{
"_shard": "[stof_64169215][0]",
"_node": "fVeabsK0Q1GnCZ_8oROXjA",
"_index": "stof_64169215",
"_type": "_doc",
"_id": "1",
"_score": 3.685826,
"_source": {
"name": "foo",
"count": 10
},
"_explanation": {
"value": 3.685826,
"description": "sum of",
"details": [
{
"value": 0.52354836,
"description": "sum of:",
"details": [
{
"value": 0.52354836,
"description": "weight(name:foo in 0) [PerFieldSimilarity], result of:",
"details": [
{
"value": 0.52354836,
"description": "score(freq=1.0), computed as boost * idf * tf from:",
"details": [
{
"value": 2.2,
"description": "boost",
"details": []
},
{
"value": 0.47000363,
"description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details": [
{
"value": 2,
"description": "n, number of documents containing term",
"details": []
},
{
"value": 3,
"description": "N, total number of documents with field",
"details": []
}
]
},
{
"value": 0.50632906,
"description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details": [
{
"value": 1.0,
"description": "freq, occurrences of term within document",
"details": []
},
{
"value": 1.2,
"description": "k1, term saturation parameter",
"details": []
},
{
"value": 0.75,
"description": "b, length normalization parameter",
"details": []
},
{
"value": 1.0,
"description": "dl, length of field",
"details": []
},
{
"value": 1.3333334,
"description": "avgdl, average length of field",
"details": []
}
]
}
]
}
]
}
]
},
{
"value": 3.1622777,
"description": "min of:",
"details": [
{
"value": 3.1622777,
"description": "field value function: sqrt(doc['count'].value?:9.0 * factor=1.0)",
"details": []
},
{
"value": 3.4028235E38,
"description": "maxBoost",
"details": []
}
]
}
]
}
},
{
"_shard": "[stof_64169215][0]",
"_node": "fVeabsK0Q1GnCZ_8oROXjA",
"_index": "stof_64169215",
"_type": "_doc",
"_id": "2",
"_score": 2.7596164,
"_source": {
"name": "bar",
"count": 5
},
"_explanation": {
"value": 2.7596164,
"description": "sum of",
"details": [
{
"value": 0.52354836,
"description": "sum of:",
"details": [
{
"value": 0.52354836,
"description": "weight(name:bar in 0) [PerFieldSimilarity], result of:",
"details": [
{
"value": 0.52354836,
"description": "score(freq=1.0), computed as boost * idf * tf from:",
"details": [
{
"value": 2.2,
"description": "boost",
"details": []
},
{
"value": 0.47000363,
"description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details": [
{
"value": 2,
"description": "n, number of documents containing term",
"details": []
},
{
"value": 3,
"description": "N, total number of documents with field",
"details": []
}
]
},
{
"value": 0.50632906,
"description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details": [
{
"value": 1.0,
"description": "freq, occurrences of term within document",
"details": []
},
{
"value": 1.2,
"description": "k1, term saturation parameter",
"details": []
},
{
"value": 0.75,
"description": "b, length normalization parameter",
"details": []
},
{
"value": 1.0,
"description": "dl, length of field",
"details": []
},
{
"value": 1.3333334,
"description": "avgdl, average length of field",
"details": []
}
]
}
]
}
]
}
]
},
{
"value": 2.236068,
"description": "min of:",
"details": [
{
"value": 2.236068,
"description": "field value function: sqrt(doc['count'].value?:9.0 * factor=1.0)",
"details": []
},
{
"value": 3.4028235E38,
"description": "maxBoost",
"details": []
}
]
}
]
}
}
]
}
}
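The per-term values in the explain output can be recomputed from the formulas printed in its description fields (boost * idf * tf). The Python sketch below does that; the function name bm25_term_score is an assumption, and the inputs (boost 2.2, k1 1.2, b 0.75, n 2, N 3, avgdl 1.3333334) are copied from the output above.

```python
import math


def bm25_term_score(freq, dl, avgdl, n, N, boost=2.2, k1=1.2, b=0.75):
    """Term score as described in the explain output: boost * idf * tf."""
    idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
    tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))
    return boost * idf * tf


AVGDL = 4 / 3  # average field length reported as 1.3333334

# Field length 2 ("foo bar") versus field length 1 ("foo" or "bar").
two_word = bm25_term_score(freq=1.0, dl=2.0, avgdl=AVGDL, n=2, N=3)
one_word = bm25_term_score(freq=1.0, dl=1.0, avgdl=AVGDL, n=2, N=3)

print(f"term score in 'foo bar': {two_word:.8f}")
print(f"term score in 'foo' or 'bar': {one_word:.8f}")
print(f"match score of 'foo bar' (two terms): {2 * two_word:.8f}")
```

Each term scores lower in foo bar because the field is longer, but foo bar matches both terms, so its match score (about 0.78) is still higher than that of the single-word documents (about 0.52).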
Search result:
"hits": [
  {
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "3",
    "_score": 3.7803833,
    "_source": {
      "name": "foo bar"
    }
  },
  {
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "1",
    "_score": 3.685826,
    "_source": {
      "name": "foo",
      "count": 10
    }
  },
  {
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "2",
    "_score": 2.7596164,
    "_source": {
      "name": "bar",
      "count": 5
    }
  }
]
Answer 1 (score: 0)
Add this to your search request:
"sort": [
  {
    "name.keyword": {
      "order": "desc"
    }
  },
  "_score"
],
Your complete search then looks like this:
GET product/_search
{
  "sort": [
    {
      "name.keyword": {
        "order": "desc"
      }
    },
    "_score"
  ],
  "query": {
    "function_score": {
      "query": {
        "match": {"name": "foo bar"}
      },
      "field_value_factor": {
        "field": "count",
        "missing": 0
      }
    }
  }
}
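To get a feel for what this sort clause does, here is a small Python simulation. It assumes a name.keyword subfield exists (as created by the default dynamic mapping) and that keyword values sort like plain ASCII strings. The names come from the sample data; the scores are placeholders, not real Elasticsearch output.

```python
# Hypothetical hits: names from the sample data, scores are placeholders.
hits = [
    {"name": "foo", "score": 5.2},
    {"name": "bar", "score": 2.6},
    {"name": "foo bar", "score": 0.0},
    {"name": "foo baz", "score": 6.3},
]


def sort_like_request(hits: list) -> list:
    """Order by name descending first, then by score descending as tie-breaker."""
    return sorted(hits, key=lambda h: (h["name"], h["score"]), reverse=True)


ordered_names = [h["name"] for h in sort_like_request(hits)]
print(ordered_names)
```

Because the name is the primary sort key, the order is purely lexical and _score only breaks ties between identical names. For this data that happens to put foo bar ahead of foo and bar, but it does not take count into account.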