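I have a bunch of TV episodes indexed in Elasticsearch.
Each episode is tagged with a list of tags describing the main features of its content.
Now I want to implement a "more like this" feature: given an episode, I want to find all episodes whose tags overlap with it the most (but not necessarily completely).
Example: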
Original Episode Tags: ["a","b","c","d"]
Some Other Episode 1: ["a","b"] // should match, 2 matching tags
Some Other Episode 2: ["a","b","c","x","y"] // should match higher, 3 matching tags
Some Other Episode 3: ["a"] // should match lower, only 1 matching tag
Some Other Episode 4: ["e","f","g"] // shouldn't match, no matching tags
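The ranking asked for above can be stated as a small Python sketch that does not involve Elasticsearch at all. The function name `rank_by_overlap` and the candidate labels are hypothetical and only serve to illustrate the expected ordering.

```python
def rank_by_overlap(original_tags, candidates):
    """Return (name, shared_tag_count) pairs, highest overlap first.

    Candidates that share no tag with the original are dropped.
    """
    wanted = set(original_tags)
    scored = [(name, len(wanted & set(tags))) for name, tags in candidates.items()]
    matching = [(name, count) for name, count in scored if count > 0]
    return sorted(matching, key=lambda pair: pair[1], reverse=True)


# Hypothetical labels mirroring the tag lists in the example above.
candidates = {
    "two_shared": ["a", "b"],
    "three_shared": ["a", "b", "c", "x", "y"],
    "one_shared": ["a"],
    "none_shared": ["e", "f", "g"],
}

ranking = rank_by_overlap(["a", "b", "c", "d"], candidates)
print(ranking)
# [('three_shared', 3), ('two_shared', 2), ('one_shared', 1)]
```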
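I tried a bool query with should clauses, but the problem is that once the minimum_should_match requirement is met, the document matches, and the remaining clauses seem to be ignored in the score calculation.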
Answer 0 (score: 0)
I think I found a (if not the) correct way to do it:
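{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "should": [
            {"term": {"tags": "a"}},
            {"term": {"tags": "b"}},
            {"term": {"tags": "c"}}
          ]
        }
      },
      "functions": [
        {"filter": {"term": {"tags": "a"}}, "weight": 5},
        {"filter": {"term": {"tags": "b"}}, "weight": 5},
        {"filter": {"term": {"tags": "c"}}, "weight": 5}
      ],
      "score_mode": "sum"
    }
  }
}
The should clauses ensure that a matching document shares at least one tag, and each entry in functions contributes a weight of 5 for every tag that matches. function_score multiplies the function results together by default, so score_mode is set to sum here to make each matching tag add 5 instead of multiplying by 5. The summed weight is then combined with the query score according to boost_mode, which defaults to multiply.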
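Writing one should clause and one function per tag by hand gets tedious, so the query body can be generated from the tag list. The following is a minimal sketch, not a definitive implementation: the helper name `build_similar_query` and the `weight` parameter are hypothetical, and the field name `tags` is taken from the query above. It sets `score_mode` to `sum` on the assumption that the weights should add up per matching tag, since function_score multiplies them by default.

```python
import json


def build_similar_query(tags, weight=5):
    """Build a function_score query body that rewards each shared tag."""
    # One term clause per tag; reused both as a should clause and as a filter.
    term_clauses = [{"term": {"tags": tag}} for tag in tags]
    return {
        "query": {
            "function_score": {
                # Match any document sharing at least one tag.
                "query": {"bool": {"should": term_clauses}},
                # Each matching tag contributes `weight` to the function score.
                "functions": [
                    {"filter": clause, "weight": weight} for clause in term_clauses
                ],
                # Add the weights instead of multiplying them (the default).
                "score_mode": "sum",
            }
        }
    }


body = build_similar_query(["a", "b", "c", "d"])
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of a search request with whichever client is in use.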