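I am trying to compute a weighted average by aggregating over a nested list. Each document holds the details of one student; the subjects differ from student to student, and each subject carries a different weight.
I want to calculate the weighted average per subject.
My documents have the following format: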
[{'class': '10th',
'id': '1',
'subject': [{'marks': 60, 'name': 's1', 'weight': 30},
{'marks': 80, 'name': 's2', 'weight': 70}]},
{'class': '11th',
'id': '2',
'subject': [{'marks': 43, 'name': 's10', 'weight': 40},
{'marks': 54, 'name': 's20', 'weight': 60}]},
{'class': '10th',
'id': '3',
'subject': [{'marks': 43, 'name': 's1', 'weight': 20},
{'marks': 54, 'name': 's20', 'weight': 80}]},
{'class': '10th',
'id': '4',
'subject': [{'marks': 69, 'name': 's10', 'weight': 30},
{'marks': 45, 'name': 's2', 'weight': 70}]}]
Here s1, s10, s2 and s20 are the subjects. For a given class, say "10th", I am trying to aggregate the weighted average per subject.
The query I ran is:
GET students_try/_search
{
"query": {
"match": {
"class": "10th"
}
},
"aggs": {
"subjects": {
"nested": {
"path": "subject"
},
"aggs": {
"subjects": {
"terms": {
"field": "subject.name"
},
"aggs": {
"avg_score": {
"avg": {
"field": "subject.marks"
}
},
"weighted_grade": {
"weighted_avg": {
"value": {
"field": "subject.marks"
},
"weight": {
"field": "subject.weight"
}
}
}
}
}
}
}
},
"size": 0
}
The error I get is:
{u'error': {u'col': 211,
u'line': 1,
u'reason': u'Unknown BaseAggregationBuilder [weighted_avg]',
u'root_cause': [{u'col': 211,
u'line': 1,
u'reason': u'Unknown BaseAggregationBuilder [weighted_avg]',
u'type': u'unknown_named_object_exception'}],
u'type': u'unknown_named_object_exception'},
u'status': 400}
I am not sure what is causing the error.
Answer 0 (score: 1)
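Yes, as Nishant mentioned, the weighted_avg aggregation is only available from version 6.4 onwards, as noted in the "A few others" section of the 6.4 release details at this link. A cluster running an earlier version does not recognise the aggregation name, which is why you get "Unknown BaseAggregationBuilder [weighted_avg]".
However, I came up with the following query using the Bucket Script Aggregation, which computes the weighted average for each bucket: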
POST <your_index_name>/_search
{
"size": 0,
"query": {
"match": {
"class": "10th"
}
},
"aggs": {
"subjects": {
"nested": {
"path": "subject"
},
"aggs": {
"subjects": {
"terms": {
"field": "subject.name.keyword"
},
"aggs": {
"avg_score": {
"avg": {
"field": "subject.marks"
}
},
"sum_productOfMarksAndWeight": {
"sum": {
"script": "doc['subject.marks'].value * doc['subject.weight'].value"
}
},
"sum_weights": {
"sum": {
"field": "subject.weight"
}
},
"weighted_avg":{
"bucket_script": {
"buckets_path": {
"sumScore": "sum_productOfMarksAndWeight",
"sumWeights": "sum_weights"
},
"script": "params.sumScore/params.sumWeights"
}
}
}
}
}
}
}
}
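If you look closely at the aggregation above, for each bucket I used the Sum Aggregation to compute the sum of weights and the sum of the products of weights and marks, and then used those two sums to compute the weighted average.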
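To sanity-check the numbers the aggregation should produce, the same arithmetic can be reproduced outside Elasticsearch. The following is only a minimal Python sketch over the sample documents from the question; the helper name `weighted_avg_by_subject` is hypothetical, and it assumes the weights of each subject sum to a non-zero value.

```python
from collections import defaultdict

# Sample documents copied from the question.
students = [
    {"class": "10th", "id": "1",
     "subject": [{"marks": 60, "name": "s1", "weight": 30},
                 {"marks": 80, "name": "s2", "weight": 70}]},
    {"class": "11th", "id": "2",
     "subject": [{"marks": 43, "name": "s10", "weight": 40},
                 {"marks": 54, "name": "s20", "weight": 60}]},
    {"class": "10th", "id": "3",
     "subject": [{"marks": 43, "name": "s1", "weight": 20},
                 {"marks": 54, "name": "s20", "weight": 80}]},
    {"class": "10th", "id": "4",
     "subject": [{"marks": 69, "name": "s10", "weight": 30},
                 {"marks": 45, "name": "s2", "weight": 70}]},
]


def weighted_avg_by_subject(docs, class_name):
    """Hypothetical helper: per-subject weighted average for one class.

    Mirrors the two sums used in the query: sum(marks * weight) and
    sum(weight), divided per subject (assumes a non-zero weight sum).
    """
    sum_product = defaultdict(float)  # plays the role of sum_productOfMarksAndWeight
    sum_weights = defaultdict(float)  # plays the role of sum_weights
    for doc in docs:
        if doc["class"] != class_name:
            continue  # same effect as the match query on "class"
        for subj in doc["subject"]:
            sum_product[subj["name"]] += subj["marks"] * subj["weight"]
            sum_weights[subj["name"]] += subj["weight"]
    return {name: sum_product[name] / sum_weights[name] for name in sum_weights}


result = weighted_avg_by_subject(students, "10th")
```

For s1 this gives (60 * 30 + 43 * 20) / (30 + 20) = 2660 / 50 = 53.2, which is what the bucket script should return for that bucket.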
Below is how the response looks. Note that the sum of weights and the sum of the products of weights and marks also appear in the aggregation result.
{
"took": 12,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0,
"hits": []
},
"aggregations": {
"subjects": {
"doc_count": 6,
"subjects": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "s1",
"doc_count": 2,
"sum_weights": {
"value": 50
},
"sum_productOfMarksAndWeight": {
"value": 2660
},
"avg_score": {
"value": 51.5
},
"weighted_avg": {
"value": 53.2
}
},
{
"key": "s2",
"doc_count": 2,
"sum_weights": {
"value": 140
},
"sum_productOfMarksAndWeight": {
"value": 8750
},
"avg_score": {
"value": 62.5
},
"weighted_avg": {
"value": 62.5
}
},
{
"key": "s10",
"doc_count": 1,
"sum_weights": {
"value": 30
},
"sum_productOfMarksAndWeight": {
"value": 2070
},
"avg_score": {
"value": 69
},
"weighted_avg": {
"value": 69
}
},
{
"key": "s20",
"doc_count": 1,
"sum_weights": {
"value": 80
},
"sum_productOfMarksAndWeight": {
"value": 4320
},
"avg_score": {
"value": 54
},
"weighted_avg": {
"value": 54
}
}
]
}
}
}
}
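If you consume this response from Python, the per-subject values sit under the nested aggregation and then the terms aggregation, both named `subjects` in the query. The sketch below is illustrative only: `extract_weighted_avgs` is a hypothetical helper, and the `response` dict is a trimmed copy of the output above rather than a live client call.

```python
# Trimmed copy of the response shown above (only the fields used here).
response = {
    "aggregations": {
        "subjects": {
            "doc_count": 6,
            "subjects": {
                "buckets": [
                    {"key": "s1", "doc_count": 2, "weighted_avg": {"value": 53.2}},
                    {"key": "s2", "doc_count": 2, "weighted_avg": {"value": 62.5}},
                    {"key": "s10", "doc_count": 1, "weighted_avg": {"value": 69}},
                    {"key": "s20", "doc_count": 1, "weighted_avg": {"value": 54}},
                ]
            },
        }
    }
}


def extract_weighted_avgs(resp):
    """Hypothetical helper: map each subject key to its weighted average.

    Walks nested aggregation -> terms aggregation -> buckets. Uses .get so
    a bucket without a "weighted_avg" entry yields None instead of raising.
    """
    buckets = resp["aggregations"]["subjects"]["subjects"]["buckets"]
    return {b["key"]: b.get("weighted_avg", {}).get("value") for b in buckets}


weighted = extract_weighted_avgs(response)
```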
I hope this helps; if not, let me know. If you think this solves your problem, please go ahead and accept this answer ;-)