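For the past two days, my team has been trying to solve a problem with querying data from Elasticsearch (ES). Our goal is to get data aggregated by a field in ES and to accumulate two values per group. Translated into SQL, we need something like this:
SELECT MAX(FIELD1) AS F1, MAX(FIELD2) AS F2 FROM ES GROUP BY FIELD3 HAVING F1 = 'SOME_TEXT'
Note that F1 is a text field.
The only approach we have found so far is: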
{
  "size": 0,
  "aggs": {
    "flowId": {
      "terms": {
        "field": "flowId.keyword"
      },
      "aggs": {
        "scenario": { "terms": { "field": "scnName.keyword" } },
        "max_time": { "max": { "field": "inFlowTimeNsec" } },
        "sales_bucket_filter": {
          "bucket_selector": {
            "buckets_path": {
              "totalSales": "scenario"
            },
            "script": "params.totalSales != null && params.totalSales == 'Test' "
          }
        }
      }
    }
  }
}
The error we get is:
{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "aggregation_execution_exception",
      "reason": "buckets_path must reference either a number value or a single value numeric metric aggregation, got: org.elasticsearch.search.aggregations.bucket.terms.StringTerms"
    }
  },
  "status": 503
}
As far as I know, this issue has already been reported: https://github.com/elastic/elasticsearch/issues/23874
The output of the query above without the bucket_selector part looks like this:
{
  "took": 52,
  "timed_out": false,
  "_shards": {
    "total": 480,
    "successful": 480,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 15657901,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "flowId": {
      "doc_count_error_upper_bound": 4104,
      "sum_other_doc_count": 9829317,
      "buckets": [
        {
          "key": "0_66718_31120bfd_39ae_4258_81e8_08abd89a81bf",
          "doc_count": 107816,
          "scenario": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "GetPop",
                "doc_count": 12
              }
            ]
          },
          "max_time": {
            "value": 121244876800
          }
        },
        {
          "key": "0_67116_31120bfd_39ae_4258_81e8_08abd89a81bf",
          "doc_count": 107752,
          "scenario": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "GetPop",
                "doc_count": 12
              }
            ]
          },
          "max_time": {
            "value": 120955101184
          }
        },
        …
}
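To make the requirement concrete, here is a minimal sketch of the same filtering done client-side in Python on a response shaped like the one above. The function name `filter_flows_by_scenario` and the trimmed `sample_response` are hypothetical, and reading `HAVING MAX(FIELD1) = ...` as "the lexicographically largest scenario key returned for the flow" is an assumption. A terms aggregation returns only its top buckets by default, so this is an illustration of the intended result, not a substitute for a server-side filter.

```python
from typing import Any, Dict, List

# Trimmed, hypothetical copy of the response above (two flowId buckets only).
sample_response: Dict[str, Any] = {
    "aggregations": {
        "flowId": {
            "buckets": [
                {
                    "key": "0_66718_31120bfd_39ae_4258_81e8_08abd89a81bf",
                    "doc_count": 107816,
                    "scenario": {"buckets": [{"key": "GetPop", "doc_count": 12}]},
                    "max_time": {"value": 121244876800},
                },
                {
                    "key": "0_67116_31120bfd_39ae_4258_81e8_08abd89a81bf",
                    "doc_count": 107752,
                    "scenario": {"buckets": [{"key": "GetPop", "doc_count": 12}]},
                    "max_time": {"value": 120955101184},
                },
            ]
        }
    }
}


def filter_flows_by_scenario(response: Dict[str, Any], wanted: str) -> List[Dict[str, Any]]:
    """Keep flowId buckets whose largest scenario key equals `wanted`.

    Assumption: MAX over a text field means the lexicographic maximum of the
    scenario keys that the terms sub-aggregation returned for that flow.
    """
    kept = []
    for bucket in response["aggregations"]["flowId"]["buckets"]:
        scenario_keys = [b["key"] for b in bucket["scenario"]["buckets"]]
        # Skip flows with no scenario buckets at all.
        if scenario_keys and max(scenario_keys) == wanted:
            kept.append(
                {
                    "flowId": bucket["key"],
                    "scenario": wanted,
                    "max_time": bucket["max_time"]["value"],
                }
            )
    return kept


if __name__ == "__main__":
    for row in filter_flows_by_scenario(sample_response, "GetPop"):
        print(row)
```

With our data volume this means pulling every flowId bucket to the client first, which is exactly what we would like to avoid.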
The question is: is there any other way to achieve what we need? That is, we need to filter the results of the aggregation...
Many thanks, EG