Suppose I have documents like the ones below. The agenda here is to group them and order the groups by the max similarity attribute.
{
  "_id": 1,
  "threat": {
    "application_number": 1234
  },
  "score_algorithms": [
    {
      "score": 21
    },
    {
      "score": 93
    }
  ],
  "max_similarity": 93
}
{
  "_id": 2,
  "threat": {
    "application_number": 1348
  },
  "score_algorithms": [
    {
      "score": 45
    },
    {
      "score": 67
    }
  ],
  "max_similarity": 67
}
{
  "_id": 3,
  "threat": {
    "application_number": 1234
  },
  "score_algorithms": [
    {
      "score": 98
    },
    {
      "score": 51
    }
  ],
  "max_similarity": 98
}
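The result should then be ordered by max_similarity: the first entry is the group of all documents whose threat.application_number is 1234 (its highest max_similarity is 98), the second entry is the group of all documents whose threat.application_number is 1348, and so on for every threat.application_number value.
Answer 0 (score: 1)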
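For requirements 1 and 2, i.e. grouping and sorting the documents, you can use the order parameter in the aggregation definition.
To retrieve the score_algorithms field inside the aggregation, use a top_hits sub-aggregation.
Each group returns at most as many documents as the size parameter of the top_hits aggregation allows. If a single application_number contains a large number of documents, this can be slow.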
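The grouping and ordering asked for in the question can be emulated locally to make the expected result concrete. The following is only a sketch in plain Python, not Elasticsearch code: the function name group_and_sort and the output keys are made up for illustration, and the documents are the three samples from the question reduced to the fields that matter here.

```python
from collections import defaultdict

# Sample documents from the question, reduced to the fields used here.
docs = [
    {"_id": 1, "threat": {"application_number": 1234}, "max_similarity": 93},
    {"_id": 2, "threat": {"application_number": 1348}, "max_similarity": 67},
    {"_id": 3, "threat": {"application_number": 1234}, "max_similarity": 98},
]


def group_and_sort(documents, size=100):
    """Hypothetical helper: group by threat.application_number, order the
    groups by their highest max_similarity (descending), and keep at most
    `size` documents per group, best first."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc["threat"]["application_number"]].append(doc)

    result = []
    for app_number, members in groups.items():
        # Best-scoring document first inside each group.
        members.sort(key=lambda d: d["max_similarity"], reverse=True)
        result.append({
            "application_number": app_number,
            "max": members[0]["max_similarity"],
            "docs": members[:size],
        })

    # Groups with the highest max_similarity come first.
    result.sort(key=lambda g: g["max"], reverse=True)
    return result


grouped = group_and_sort(docs)
for group in grouped:
    print(group["application_number"], group["max"],
          [d["_id"] for d in group["docs"]])
```

With the sample data, the group for 1234 comes first because its best document scores 98, followed by the group for 1348 with 67.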
{
  "size": 0,
  "aggs": {
    "applications": {
      "terms": {
        "field": "threat.application_number",
        "order": [{"stats.max": "desc"}]
      },
      "aggs": {
        "stats": { "stats": { "field": "max_similarity" } },
        "applications_fields": {
          "top_hits": {
            "sort": [
              {
                "max_similarity": {
                  "order": "desc"
                }
              }
            ],
            "_source": {
              "includes": [ "score_algorithms", "max_similarity" ]
            },
            "size": 100
          }
        }
      }
    }
  }
}
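As a rough sketch of how the result of this query might be consumed, the Python snippet below walks the aggregation part of a search response. It assumes the response has already been parsed into a dict; the helper name summarize is hypothetical, and sample_response is a hand-written stand-in shaped like the aggregations section for the aggregation names used above, not output captured from a real cluster.

```python
def summarize(response):
    """Hypothetical helper: flatten the 'applications' terms buckets into
    one row per application_number, keeping the bucket order returned by
    Elasticsearch (already sorted by stats.max, descending)."""
    rows = []
    for bucket in response["aggregations"]["applications"]["buckets"]:
        hits = bucket["applications_fields"]["hits"]["hits"]
        rows.append({
            "application_number": bucket["key"],
            "max_similarity": bucket["stats"]["max"],
            "documents": [hit["_source"] for hit in hits],
        })
    return rows


# Hand-written example data, trimmed to the keys that summarize() reads.
sample_response = {
    "aggregations": {
        "applications": {
            "buckets": [
                {
                    "key": 1234,
                    "doc_count": 2,
                    "stats": {"max": 98.0},
                    "applications_fields": {"hits": {"hits": [
                        {"_id": "3", "_source": {
                            "score_algorithms": [{"score": 98}, {"score": 51}],
                            "max_similarity": 98}},
                        {"_id": "1", "_source": {
                            "score_algorithms": [{"score": 21}, {"score": 93}],
                            "max_similarity": 93}},
                    ]}},
                },
                {
                    "key": 1348,
                    "doc_count": 1,
                    "stats": {"max": 67.0},
                    "applications_fields": {"hits": {"hits": [
                        {"_id": "2", "_source": {
                            "score_algorithms": [{"score": 45}, {"score": 67}],
                            "max_similarity": 67}},
                    ]}},
                },
            ]
        }
    }
}

rows = summarize(sample_response)
for row in rows:
    print(row["application_number"], row["max_similarity"], len(row["documents"]))
```

Each row then carries the application number, the group's highest max_similarity, and the score_algorithms and max_similarity fields of up to size documents from that group.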