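Problem statement: I have nested documents. I need to apply some filters to the nested documents, and I want to sort the parent documents by the number of nested documents that satisfy those filters.
// Example problem
We need to apply both the filter and the sort in a single query, as described below.
Sample data: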
[{
"title": "Investment secrets",
"body": "What they don't tell you ...",
"tags": [ "shares", "equities" ],
"comments": [
{
"id": 12313,
"city": "Pune",
"name": "Mary Brown 1",
"comment": "Lies, lies, lies ",
"date": "2018-10-18"
},
{
"id": 12314,
"city": "Pune",
"name": "Mary Brown 1",
"comment": "Lies, lies, lies ",
"date": "2018-10-20"
}
]
},
{
"title": "Investment secrets",
"body": "What they don't tell you ...",
"tags": [ "shares", "equities" ],
"comments": [
{
"id": 12315,
"city": "Pune",
"name": "Mary Brown ",
"comment": "Lies, lies, lies ",
"date": "2018-10-18"
},
{
"id": 12316,
"city": "Bangalore",
"name": "Mary Brown ",
"comment": "Lies, lies, lies ",
"date": "2018-10-20"
}
]
}]
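To make the expected result concrete, here is a minimal Python sketch of the ordering we are after, computed outside Elasticsearch on a trimmed copy of the sample data. The helper names `matching_count` and `filter_and_sort` are only illustrative, and the filter value "Pune" is the one used in the queries below.

```python
# Sample data from above, trimmed to the fields that matter for the filter.
docs = [
    {"title": "Investment secrets", "comments": [
        {"id": 12313, "city": "Pune"},
        {"id": 12314, "city": "Pune"},
    ]},
    {"title": "Investment secrets", "comments": [
        {"id": 12315, "city": "Pune"},
        {"id": 12316, "city": "Bangalore"},
    ]},
]


def matching_count(doc, city):
    """Number of nested comments in `doc` whose city equals `city`."""
    return sum(1 for c in doc["comments"] if c["city"] == city)


def filter_and_sort(docs, city):
    """Keep parents with at least one matching comment, most matches first."""
    counted = [(matching_count(d, city), d) for d in docs]
    kept = [(n, d) for n, d in counted if n > 0]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept


result = filter_and_sort(docs, "Pune")
counts = [n for n, _ in result]
print(counts)  # [2, 1]
```

For the filter city = "Pune", the first parent (two matching comments) should come before the second (one matching comment). That is the behaviour we want from a single Elasticsearch query.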
Solutions tried so far:
Using a script (script-based sort)
{
"query":{
"nested":{
"path":"comments",
"inner_hits":{
},
"query":{
"term":{
"comments.city":{
"value":"Pune"
}
}
}
}
},
"sort":{
"_script":{
"script":"params._source.inner_hits.comments.hits.total",
"type":"number",
"order":"desc"
}
}}
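Problem with this solution => inner_hits is not accessible here.
We also tried score_mode: sum. It seems to work more correctly for fields that already hold numeric values; otherwise the results are still sorted, but we cannot tell how the resulting order relates to relevance. Our current data structure also has no numeric field other than the ID, so score_mode: sum apparently cannot be used here.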
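One possible variation on the script idea is to count the matches directly from the parent's own comments array instead of reading inner_hits, since inner_hits only exists in the search response and not in the stored document. The Python sketch below only builds the request body. It assumes that _source is enabled and that params._source is readable from sort scripts in the Elasticsearch version in use, which I have not verified for every version. The function name `build_count_sorted_query` is hypothetical.

```python
import json

# Painless source for the sort script (assumption: params._source is
# available in the sort context). It walks the parent's `comments`
# array and counts the entries whose city equals params.city.
COUNT_MATCHING_COMMENTS = """
int n = 0;
for (def c : params._source.comments) {
  if (c.city == params.city) { n++; }
}
return n;
"""


def build_count_sorted_query(city):
    """Nested filter on comments.city plus a script sort on the match count."""
    return {
        "query": {
            "nested": {
                "path": "comments",
                "inner_hits": {},
                "query": {"term": {"comments.city": {"value": city}}},
            }
        },
        "sort": {
            "_script": {
                "type": "number",
                "order": "desc",
                "script": {
                    "lang": "painless",
                    "source": COUNT_MATCHING_COMMENTS,
                    "params": {"city": city},
                },
            }
        },
    }


body = build_count_sorted_query("Pune")
print(json.dumps(body, indent=2))
```

Note that the filter logic is duplicated between the nested query and the script, and that reading _source for every hit is likely to be slow on large result sets.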
{
"query":{
"nested":{
"path":"comments",
"query":{
"bool":{
"must":[
{
"term":{
"comments.numericField":{
"value":"numericValue"
}
}
}
]
}
},
"score_mode":"sum",
"inner_hits":{
}
}
}
}
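For reference, score_mode: sum does not strictly need a numeric field. If the nested filter is wrapped in constant_score, every matching comment contributes the same fixed score, so the summed parent score equals the number of matching comments. The Python sketch below only builds such a request body. It still relies on _score, so it may not fit if other scored clauses are combined with it, and the function name `build_sum_score_query` is hypothetical.

```python
import json


def build_sum_score_query(city):
    """Nested query where each matching comment scores exactly 1.0.

    With score_mode "sum", the parent's _score becomes the number of
    nested comments that match the filter.
    """
    return {
        "query": {
            "nested": {
                "path": "comments",
                "score_mode": "sum",
                "inner_hits": {},
                "query": {
                    "constant_score": {
                        "filter": {"term": {"comments.city": {"value": city}}},
                        "boost": 1.0,
                    }
                },
            }
        },
        # Sort parents by the summed score, highest count first.
        "sort": [{"_score": {"order": "desc"}}],
    }


sum_body = build_sum_score_query("Pune")
print(json.dumps(sum_body, indent=2))
```

On the sample data, this should give the first parent a score of 2.0 and the second a score of 1.0 for city = "Pune".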
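We considered a custom function_score but have not tried it yet, because deciding the sort priority could get complicated when there are many filters.
We are therefore looking for a way to solve this without relying on the score.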