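Is it possible to sort terms aggregation results using a script?
Specifically, I want to sort the buckets by the quotient of two aggregated fields (a non-additive click-through rate, CTR).
So far I have found a couple of approaches, but neither does what I am looking for: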
"aggs": {
  "myagg": {
    "aggs": {
      /* this CTR is incorrect */
      "calculated_ctr" : { "sum" : { "script" : "try { doc['clicks'].value / doc['impressions'].value } catch(Exception ignored) { return 0; } " } }
    },
    "terms": {
      "field": "publisher_id",
      "order": {
        "calculated_ctr": "desc"
      }
    }
  }
}
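The above sorts by the calculated field just fine, but the CTR is incorrect. The ratio is computed for each document and then summed, and the problem is that we are summing a non-additive ratio.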
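To make the difference concrete, here is a small Python sketch with made-up numbers. The two documents and the function names are hypothetical and only illustrate the arithmetic; they are not taken from my index.

```python
# Hypothetical (clicks, impressions) pairs for the documents of one publisher.
docs = [(1, 10), (5, 1000)]


def sum_of_ratios(docs):
    """What the first approach computes: a per-document CTR, then summed."""
    return sum(clicks / imps if imps else 0.0 for clicks, imps in docs)


def ratio_of_sums(docs):
    """The CTR I actually want: total clicks divided by total impressions."""
    total_clicks = sum(clicks for clicks, _ in docs)
    total_imps = sum(imps for _, imps in docs)
    return total_clicks / total_imps if total_imps else 0.0


print("sum of per-document ratios:", sum_of_ratios(docs))
print("ratio of sums:", ratio_of_sums(docs))
```

With these numbers the summed per-document ratio comes out near 0.105, while the real CTR is 6 / 1010, so ordering buckets by the first value can rank publishers very differently from ordering by the second.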
Second approach:
"aggs": {
  "myagg": {
    "aggs": {
      "calculated_ctr" : {
        "scripted_metric": {
          "init_script" : "params._agg.imps = []; params._agg.clicks = []",
          "map_script" : "params._agg.imps.add(doc.impressions.value); params._agg.clicks.add(doc.clicks.value);",
          "combine_script" : "double impressions, clicks = 0; for (i in params._agg.imps) { impressions += i } for (c in params._agg.clicks) { clicks += c } return [impressions, clicks]",
          "reduce_script" : "double impressions, clicks = 0; for (a in params._aggs) { impressions += a[0]; clicks += a[1]; } return clicks / impressions"
        }
      }
    },
    "terms": {
      "field": "publisher_id",
      "order": {
        "calculated_ctr": "desc" /* throws an error */
      }
    }
  }
}
This one is somewhat better because we get the correct CTR. However, it turns out that sorting on a scripted metric is not possible:
https://github.com/elastic/elasticsearch/issues/8486
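For anyone unfamiliar with scripted metrics, this is roughly what the second approach computes, mimicked in plain Python. It is only a sketch of the map / combine / reduce flow: the shard layout and the sample documents are made up, the function names are mine, and the guard against zero impressions is an addition that the scripts above do not have.

```python
def map_phase(shard_docs):
    """Per shard: collect impressions and clicks from every document (init + map)."""
    state = {"imps": [], "clicks": []}
    for doc in shard_docs:
        state["imps"].append(doc["impressions"])
        state["clicks"].append(doc["clicks"])
    return state


def combine_phase(state):
    """Per shard: collapse the collected values into [impressions, clicks] totals."""
    return [float(sum(state["imps"])), float(sum(state["clicks"]))]


def reduce_phase(shard_results):
    """Across shards: add up the totals and divide once at the very end."""
    impressions = sum(result[0] for result in shard_results)
    clicks = sum(result[1] for result in shard_results)
    # Assumption: return 0.0 when there are no impressions (not in the original scripts).
    return clicks / impressions if impressions else 0.0


# Hypothetical documents for one publisher, spread over two shards.
shards = [
    [{"clicks": 1, "impressions": 10}],
    [{"clicks": 5, "impressions": 1000}],
]
ctr = reduce_phase([combine_phase(map_phase(shard)) for shard in shards])
print("CTR:", ctr)
```

Because the division happens only once, after all shard totals are combined, the result is the ratio of sums. That is the value I would like to use as the sort key for the terms buckets.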