Please see the search query below, followed by my specific questions.
search = {
    'query': {
        'function_score': {
            'score_mode': 'multiply',
            'functions': functions,
            'query': {
                'match_all': {}
            },
            'filter': {
                'bool': {
                    'must': filters_include,
                    'must_not': filters_exclude
                }
            }
        }
    },
    'sort': [{'_score': {'order': 'desc'}},
             {'time': {'order': 'desc'}}]
}
functions looks like this:
[{'weight': 5.0, 'gauss': {'time': {'scale': '7d'}}},
{'weight': 3.0, 'script_score': {'script': "1+doc['scores.year'].value"}},
{'weight': 2.0, 'script_score': {'script': "1+doc['scores.month'].value"}}]
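What happens when this query runs? Are documents scored by function_score and then ordered by the sort array? What is _score here (note that the query is match_all), and what does it do in the sort? If I reverse the order and sort on time before _score, what results should I expect?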
Answer 0 (score: 3)
Without function_score, match_all gives every document the same score: each one gets 1.
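With function_score, all three functions are evaluated (all three apply to every document, because none of them has its own filter) and their results are multiplied together (because you set score_mode: multiply). So the final score is roughly function1_score * function2_score * function3_score. That score is what the sort uses. When two documents have equal _score values, time breaks the tie.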
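To make the ordering concrete, here is a minimal Python sketch of the behaviour described above. It is not Elasticsearch code: the documents, their per-function scores, and the helper combined_score are hypothetical, chosen only to show how a multiplied score and a two-key sort interact, including what changes if time is placed before _score.

```python
from math import prod

def combined_score(function_scores):
    """Combine per-function scores the way score_mode 'multiply' does."""
    return prod(function_scores)

# Hypothetical documents: made-up per-function scores and a time in epoch ms.
docs = [
    {"id": "a", "function_scores": [0.5, 4.0, 2.0], "time": 1000},
    {"id": "b", "function_scores": [1.0, 2.0, 2.0], "time": 3000},
    {"id": "c", "function_scores": [0.25, 4.0, 2.0], "time": 2000},
]
for doc in docs:
    doc["_score"] = combined_score(doc["function_scores"])

# As in the query: _score descending, then time descending to break ties.
by_score_then_time = sorted(docs, key=lambda d: (-d["_score"], -d["time"]))

# Reversed sort array: time descending first; _score only matters
# for documents that share exactly the same time value.
by_time_then_score = sorted(docs, key=lambda d: (-d["time"], -d["_score"]))

print([d["id"] for d in by_score_then_time])
print([d["id"] for d in by_time_then_score])
```

In this sketch "a" and "b" share the same product, so time decides their order in the first sort. In the second sort the score has no visible effect, because no two documents share a time value.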
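The best approach is to take the query out of your application, paste it as JSON into Marvel's Sense dashboard, and run it with ?explain. That returns a detailed explanation of how each score was computed.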
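One way to get that JSON out of the application is to serialize the Python dict, as in the sketch below. The filter values are assumptions: the question does not show filters_include or filters_exclude, so the range filters here are only guessed from the filter description that appears in the explain output (year from 1990, month from 13 excluded).

```python
import json

# Assumed filters; the real filters_include / filters_exclude are not shown.
filters_include = [{"range": {"year": {"gte": 1990}}}]
filters_exclude = [{"range": {"month": {"gte": 13}}}]

# The functions list exactly as given in the question.
functions = [
    {"weight": 5.0, "gauss": {"time": {"scale": "7d"}}},
    {"weight": 3.0, "script_score": {"script": "1+doc['scores.year'].value"}},
    {"weight": 2.0, "script_score": {"script": "1+doc['scores.month'].value"}},
]

search = {
    "query": {
        "function_score": {
            "score_mode": "multiply",
            "functions": functions,
            "query": {"match_all": {}},
            "filter": {
                "bool": {"must": filters_include, "must_not": filters_exclude}
            },
        }
    },
    "sort": [{"_score": {"order": "desc"}}, {"time": {"order": "desc"}}],
}

# Serialize to a JSON request body that can be pasted into Sense.
body = json.dumps(search, indent=2)
print(body)
```

The printed body is what would go under a _search?explain request in Sense.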
Here is an example. Suppose we have a document containing "year":2015,"month":7,"time":"2015-07-06".
Running the query with _search?explain gives a very detailed explanation:
"hits": [
{
"_shard": 4,
"_node": "jt4AX7imTECLWH4Bofbk3g",
"_index": "test",
"_type": "test",
"_id": "3",
"_score": 26691.023,
"_source": {
"text": "whatever",
"year": 2015,
"month": 7,
"time": "2015-07-06"
},
"sort": [
26691.023,
1436140800000
],
"_explanation": {
"value": 26691.023,
"description": "function score, product of:",
"details": [
{
"value": 1,
"description": "ConstantScore(BooleanFilter(+cache(year:[1990 TO *]) -cache(month:[13 TO *]))), product of:",
"details": [
{
"value": 1,
"description": "boost"
},
{
"value": 1,
"description": "queryNorm"
}
]
},
{
"value": 26691.023,
"description": "Math.min of",
"details": [
{
"value": 26691.023,
"description": "function score, score mode [multiply]",
"details": [
{
"value": 0.2758249,
"description": "function score, product of:",
"details": [
{
"value": 1,
"description": "match filter: *:*"
},
{
"value": 0.2758249,
"description": "product of:",
"details": [
{
"value": 0.055164978,
"description": "Function for field time:",
"details": [
{
"value": 0.055164978,
"description": "exp(-0.5*pow(MIN[Math.max(Math.abs(1.4361408E12(=doc value) - 1.437377331833E12(=origin))) - 0.0(=offset), 0)],2.0)/2.63856688924644672E17)"
}
]
},
{
"value": 5,
"description": "weight"
}
]
}
]
},
{
"value": 6048,
"description": "function score, product of:",
"details": [
{
"value": 1,
"description": "match filter: *:*"
},
{
"value": 6048,
"description": "product of:",
"details": [
{
"value": 2016,
"description": "script score function, computed with script:\"1+doc['year'].value",
"details": [
{
"value": 1,
"description": "_score: ",
"details": [
{
"value": 1,
"description": "ConstantScore(BooleanFilter(+cache(year:[1990 TO *]) -cache(month:[13 TO *]))), product of:",
"details": [
{
"value": 1,
"description": "boost"
},
{
"value": 1,
"description": "queryNorm"
}
]
}
]
}
]
},
{
"value": 3,
"description": "weight"
}
]
}
]
},
{
"value": 16,
"description": "function score, product of:",
"details": [
{
"value": 1,
"description": "match filter: *:*"
},
{
"value": 16,
"description": "product of:",
"details": [
{
"value": 8,
"description": "script score function, computed with script:\"1+doc['month'].value",
"details": [
{
"value": 1,
"description": "_score: ",
"details": [
{
"value": 1,
"description": "ConstantScore(BooleanFilter(+cache(year:[1990 TO *]) -cache(month:[13 TO *]))), product of:",
"details": [
{
"value": 1,
"description": "boost"
},
{
"value": 1,
"description": "queryNorm"
}
]
}
]
}
]
},
{
"value": 2,
"description": "weight"
}
]
}
]
}
]
},
{
"value": 3.4028235e+38,
"description": "maxBoost"
}
]
},
{
"value": 1,
"description": "queryBoost"
}
]
}
}
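So for gauss the computed score is 0.055164978. I don't know how much that detail matters for your question, but let's assume the calculation is correct :-). Your gauss function has a weight of 5, so its score becomes 5 * 0.055164978 = 0.27582489.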
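The gauss value can be checked by hand from the formula printed in the explain output. The sketch below plugs in the doc value and origin from that output and assumes the default decay of 0.5, so that sigma squared is -scale^2 / (2 * ln(decay)); that assumption is consistent with the divisor 2.63856688924644672E17 shown for a scale of 7d. The function name gauss_decay is my own.

```python
import math

MS_PER_DAY = 24 * 60 * 60 * 1000

def gauss_decay(doc_value_ms, origin_ms, scale_ms, offset_ms=0.0, decay=0.5):
    """Gaussian decay as printed in the explain output (decay=0.5 is assumed)."""
    distance = max(abs(doc_value_ms - origin_ms) - offset_ms, 0.0)
    sigma_sq = -scale_ms ** 2 / (2.0 * math.log(decay))
    return math.exp(-0.5 * distance ** 2 / sigma_sq)

doc_value = 1.4361408e12      # "time": "2015-07-06" in epoch milliseconds
origin = 1.437377331833e12    # origin reported by explain (time of the query)

raw = gauss_decay(doc_value, origin, scale_ms=7 * MS_PER_DAY)
weighted = 5.0 * raw          # apply the function's weight of 5

print(raw, weighted)
```

Under these assumptions raw agrees with the 0.055164978 reported by explain, and weighted with 0.2758249, up to floating-point rounding.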
For the year script function we have (1 + 2015) * 3 = 6048.
For the month script function we have (1 + 7) * 2 = 16.
With score_mode multiply, the document's total score is 0.27582489 * 6048 * 16 = 26691.023.
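The same arithmetic as a short Python check, using only the numbers from the explain output (the variable names are mine):

```python
# Each function's contribution is weight * function value.
gauss_score = 5 * 0.055164978   # gauss on time, weight 5
year_score = 3 * (1 + 2015)     # script 1 + year, weight 3
month_score = 2 * (1 + 7)       # script 1 + month, weight 2

# score_mode "multiply": the final _score is the product of all three.
total = gauss_score * year_score * month_score

print(year_score, month_score, round(total, 3))
```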
Each document also has a section showing the values used for sorting. For this document:
"sort": [
26691.023,
1436140800000
]
The first number is the _score computed as shown above, and the second is the date 2015-07-06 expressed in milliseconds since the epoch.
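The millisecond value can be reproduced with the standard library. This sketch assumes the date is interpreted as midnight UTC, which matches the sort value shown; to_epoch_millis is a hypothetical helper name.

```python
from datetime import datetime, timezone

def to_epoch_millis(date_string):
    """Convert a YYYY-MM-DD date (taken as midnight UTC) to epoch milliseconds."""
    dt = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)

sort_value = to_epoch_millis("2015-07-06")

# Convert the sort value from the response back into a date.
round_trip = datetime.fromtimestamp(1436140800000 / 1000, tz=timezone.utc).date()

print(sort_value, round_trip.isoformat())
```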