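There doesn't seem to be a simple way to do this... How can I make sure that certain fields in my multi_match query are actually boosted properly, so that exact matches show up at the top?
Honestly, I seem to have tried this a lot of different ways, but perhaps somebody knows the answer...
In my database of movies and music, I'm trying to search multiple fields at once, while making sure that exact matches make it to the top and that certain fields (like title and artist name) get more boost.
Here's the main part of my query...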
"query": {
  "bool": {
    "should": [
      {
        "multi_match": {
          "type": "phrase_prefix",
          "query": "brave",
          "max_expansions": 10,
          "fields": [
            "title^3",
            "artists.name^2",
            "starring.name^2",
            "credits.name",
            "tracks^0.1"
          ]
        }
      }
    ],
    "minimum_number_should_match": 1
  }
}
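As you can see, the query is "brave". It just so happens that there is a movie called Brave. Perfect, I want it at the top, because not only is it an exact match, but the match is in the title. However, there is a popular song called "Brave" by Sara Bareilles that ends up on top. Why?
I have tried every analyzer known to man, custom and otherwise, and I have tried changing the "type" parameter to every other permutation (phrase, best_fields, cross_fields, most_fields), and it just doesn't seem to honor the fact that I'm effectively trying to PROMOTE 'title', 'artists.name', and 'starring.name' and DEMOTE 'tracks'.
Is there any way I can ensure that all exact matches show up at the top (especially in title, etc.), followed by the expansions, and so on?
Any suggestions would be helpful.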
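EDIT
The analyzer currently in use, which seems to work better than the others, is a custom one I call the "custom analyzer". It consists of only a 'lowercase' filter and a 'keyword' tokenizer.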
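For reference, an analyzer like the one described above could be declared roughly as follows in the index settings. This is only a sketch: the name `custom_analyzer` is a placeholder chosen for illustration, and it is assumed (not shown in this post) that every searched field is mapped to this analyzer.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  }
}
```

Note that the 'keyword' tokenizer emits the whole field value as a single token, so with this setup a title such as "Brave Little Toaster" would be indexed as the one term "brave little toaster" rather than as three separate words.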
Here are some sample documents, in the order they appear in the results:
"fields": {
  "title": [
    "Brave"
  ],
  "credits.name": [
    "Kelly MacDonald",
    "Emma Thompson",
    "Billy Connolly",
    "Julie Walters",
    "Kevin McKidd",
    "Craig Ferguson",
    "Robbie Coltrane"
  ],
  "starring.name": [
    "Emma Thompson",
    "Julie Walters",
    "Billy Connolly",
    "Kevin Mckidd",
    "Kelly Macdonald"
  ]
},
"fields": {
  "credits.name": [
    "Hilary Weeks",
    "Scott Wiley",
    "Sarah Sample",
    "Debra Fotheringham",
    "Dustin Christensen",
    "Russ Dixon"
  ],
  "title": [
    "Say Love"
  ],
  "artists.name": [
    "Hilary Weeks"
  ],
  "tracks": [
    "Say Love",
    "Another Second Chance",
    "It's A Good Day",
    "Brave",
    "I Found Me",
    "Hero",
    "Tell Me",
    "Where I Am",
    "Better Promises",
    "Even When"
  ]
},
"fields": {
  "title": [
    "Brave Little Toaster"
  ],
  "credits.name": [
    "Randy Bennett",
    "Jim Jackman",
    "Randy Cook",
    "Judy Toll",
    "Jon Lovitz",
    "Tim Stack",
    "Timothy E. Day",
    "Thurl Ravenscroft",
    "Deanna Oliver",
    "Phil Hartman",
    "Jonathon Benair",
    "Joe Ranft"
  ],
  "starring.name": [
    "Jon Lovitz",
    "Thurl Ravenscroft",
    "Tim Stack",
    "Timothy E. Day",
    "Deanna Oliver"
  ]
},
"fields": {
  "title": [
    "Braveheart"
  ],
  "credits.name": [
    "Bernard Horsfall",
    "Martin Dempsey",
    "James Robinson",
    "Robert Paterson",
    "Alan Tall",
    "Rupert Vansittart",
    "Donal Gibson",
    "Malcolm Tierney",
    "Sandy Nelson",
    "Sean Lawlor"
  ],
  "starring.name": [
    "Brendan Gleeson",
    "Sophie Marceau",
    "Mel Gibson",
    "Patrick Mcgoohan",
    "Catherine Mccormack"
  ]
}
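Perhaps somebody knows why the second result... (in this case not the Sara Bareilles one I mentioned earlier, but) Hilary Weeks, who has a track called 'Brave'... why does it come before the titles 'Braveheart' and 'Brave Little Toaster'?
EDIT AGAIN
To complicate matters further, what if I have a 'rank' field that is part of my documents? I'm finding it very difficult to fold it into my _score using a script score function...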
"functions": [
  {
    "script_score": {
      "script": "_score * 1/ doc['rank'].value"
    }
  }
]
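For context, a "functions" array like this lives inside a function_score query. Below is a minimal sketch of what the full wrapper might look like, not my exact request: the inner multi_match is shortened to two fields for illustration, 'rank' is assumed to be a numeric field that is present and non-zero on every document, and the "boost_mode" setting is my own assumption.

```json
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "type": "phrase_prefix",
          "query": "brave",
          "fields": ["title^3", "tracks^0.1"]
        }
      },
      "functions": [
        {
          "script_score": {
            "script": "_score * 1 / doc['rank'].value"
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}
```

The reason for "boost_mode": "replace" in this sketch is that the default boost_mode is "multiply", which multiplies the function result by the query score. Since the script already includes _score, leaving the default in place would effectively apply the query score twice.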