Quick question: using Elasticsearch, given a list of fields, how can I get the average number of missing fields per document as an aggregation?
Using the missing aggregation type, I can get the total number of documents that lack a given field. So, with the following data:
"hits": [{
    "name": "A name",
    "nickname": "A nickname",
    "bestfriend": "A friend",
    "hobby": "An hobby"
}, {
    "name": "A name",
    "hobby": "An hobby"
}, {
    "name": "A name",
    "nickname": "A nickname",
    "hobby": "An hobby"
}, {
    "name": "A name",
    "bestfriend": "A friend"
}]
I can run the following query:
{
    "aggs": {
        "name_missing": {
            "missing": {"field": "name"}
        },
        "nickname_missing": {
            "missing": {"field": "nickname"}
        },
        "hobby_missing": {
            "missing": {"field": "hobby"}
        },
        "bestfriend_missing": {
            "missing": {"field": "bestfriend"}
        }
    }
}
And I get the following aggregations:
...
"aggregations": {
    "name_missing": {
        "doc_count": 0
    },
    "nickname_missing": {
        "doc_count": 2
    },
    "hobby_missing": {
        "doc_count": 1
    },
    "bestfriend_missing": {
        "doc_count": 2
    }
}
...
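What I need now is the average number of missing fields per document. I could do the math on the results in code, summing the doc_count values of the missing aggregations and dividing by the total number of documents, but how can I get the same result from Elasticsearch as an aggregation?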
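For reference, the client-side calculation I would like to avoid looks roughly like the sketch below. It works on the sample documents directly rather than on a real Elasticsearch response; the helper names `missing_counts` and `average_missing` are made up for illustration, and treating a null value as missing is my assumption about how the missing aggregation behaves by default.

```python
# Sample documents from the question.
docs = [
    {"name": "A name", "nickname": "A nickname", "bestfriend": "A friend", "hobby": "An hobby"},
    {"name": "A name", "hobby": "An hobby"},
    {"name": "A name", "nickname": "A nickname", "hobby": "An hobby"},
    {"name": "A name", "bestfriend": "A friend"},
]
fields = ["name", "nickname", "hobby", "bestfriend"]


def missing_counts(docs, fields):
    """Per field, count the documents where the field is absent or null.

    This mirrors the doc_count that each missing aggregation reports.
    """
    return {f: sum(1 for d in docs if d.get(f) is None) for f in fields}


def average_missing(counts, total_docs):
    """Sum of all per-field missing counts divided by the number of documents."""
    return sum(counts.values()) / total_docs


counts = missing_counts(docs, fields)
print(counts)
print(average_missing(counts, len(docs)))
```

With a real response, `counts` would instead be filled from the `doc_count` of each `*_missing` aggregation and `total_docs` from the total number of hits.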
Thanks for any replies or suggestions.
Answer 0 (score: 1)
This is an ugly solution, but it does the job.
GET missing/missing/_search
{
    "size": 0,
    "aggs": {
        "result": {
            // A constant key puts every document into a single bucket, so that
            // bucket_script has a parent multi-bucket aggregation to run in.
            "terms": {
                "script": "'aaa'"
            },
            "aggs": {
                "name_missing": {
                    "missing": {
                        "field": "name"
                    }
                },
                "nickname_missing": {
                    "missing": {
                        "field": "nickname"
                    }
                },
                "hobby_missing": {
                    "missing": {
                        "field": "hobby"
                    }
                },
                "bestfriend_missing": {
                    "missing": {
                        "field": "bestfriend"
                    }
                },
                "avg_missing": {
                    "bucket_script": {
                        // buckets_path defines the script variables. "name_missing._count" is the
                        // doc_count of the name_missing aggregation, and likewise for nickname_missing,
                        // hobby_missing and bestfriend_missing. "count": "_count" is the doc_count of
                        // the enclosing bucket, i.e. the total number of hits.
                        "buckets_path": {
                            "name_missing": "name_missing._count",
                            "nickname_missing": "nickname_missing._count",
                            "hobby_missing": "hobby_missing._count",
                            "bestfriend_missing": "bestfriend_missing._count",
                            "count": "_count"
                        },
                        // Add up all the missing counts and divide by the total number of hits.
                        "script": "(name_missing+nickname_missing+hobby_missing+bestfriend_missing)/count"
                    }
                }
            }
        }
    }
}
I have shown you how to do it; now you can tweak the parameters and extract whatever you need.
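Since the question starts from a list of fields, here is a rough Python sketch that builds the same request body for any such list and pulls the value out of the response. The function names `build_avg_missing_query` and `extract_avg_missing` are invented for this example, and the response shape used in `extract_avg_missing` is my assumption about what a terms bucket with a bucket_script sub-aggregation returns. It also assumes every field name is usable as a script variable (no dots or hyphens) and that no field is itself called `count`. On versions where scripts are written in Painless, the variables are, as far as I know, read as `params.<name>`, which is what the optional `var_prefix` argument is for.

```python
import json


def build_avg_missing_query(fields, var_prefix=""):
    """Build the request body: one missing agg per field plus a bucket_script average.

    var_prefix is an assumption for Painless-based versions ("params.");
    leave it empty to reproduce the script exactly as written above.
    """
    # One "<field>_missing" sub-aggregation per field.
    sub_aggs = {f"{f}_missing": {"missing": {"field": f}} for f in fields}

    # Script variables: the doc_count of each missing agg, plus the bucket's own doc_count.
    buckets_path = {f"{f}_missing": f"{f}_missing._count" for f in fields}
    buckets_path["count"] = "_count"

    total = " + ".join(f"{var_prefix}{f}_missing" for f in fields)
    sub_aggs["avg_missing"] = {
        "bucket_script": {
            "buckets_path": buckets_path,
            "script": f"({total}) / {var_prefix}count",
        }
    }

    return {
        "size": 0,
        "aggs": {
            # Constant key: a single bucket that holds every document.
            "result": {"terms": {"script": "'aaa'"}, "aggs": sub_aggs}
        },
    }


def extract_avg_missing(response):
    """Read the average from the single bucket (assumed response shape)."""
    return response["aggregations"]["result"]["buckets"][0]["avg_missing"]["value"]


query = build_avg_missing_query(["name", "nickname", "hobby", "bestfriend"])
print(json.dumps(query, indent=2))
```

The printed body can be sent to `_search` with whatever client you already use; nothing here talks to Elasticsearch itself.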