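I need to compute a pipeline aggregation in ElasticSearch, but I can't figure out how to express it.
Each document has an email address and an amount. I need to sum the amounts per unique email and then output range buckets that count how many emails fall into each range of totals.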
{ "0 - 99": 300, "100 - 400": 100 ...}
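would basically be the expected output (the keys would be transformed in my application code), indicating that 300 unique emails have cumulatively received up to 99 (amount) across all documents.
Intuitively, I would expect a query like the one below. However, range does not appear to be usable as a pipeline aggregation (it does not accept buckets_path).
What is the correct approach here?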
{
  aggs: {
    users: {
      terms: {
        field: "email"
      },
      aggs: {
        amount_received: {
          sum: {
            field: "amount"
          }
        }
      }
    },
    amount_ranges: {
      range: {
        buckets_path: "users>amount_received",
        ranges: [
          { to: 99.0 },
          { from: 100.0, to: 299.0 },
          { from: 300.0, to: 599.0 },
          { from: 600.0 }
        ]
      }
    }
  }
}
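Answer 0 (score: 4)
There is no pipeline aggregation that does this directly. However, I think I came up with a solution that should fit your needs. The idea is to repeat the same terms/sum aggregation and then use a bucket_selector pipeline aggregation for each of the ranges you are interested in.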
POST index/_search
{
  "size": 0,
  "aggs": {
    "users_99": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "-99": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived < 100"
          }
        }
      }
    },
    "users_100_299": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "100-299": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 100 && params.amountReceived < 300"
          }
        }
      }
    },
    "users_300_599": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "300-599": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 300 && params.amountReceived < 600"
          }
        }
      }
    },
    "users_600": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "600": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 600"
          }
        }
      }
    }
  }
}
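The four aggregations in the query above differ only in their name and script, so the request body could be generated rather than written by hand. Here is a minimal Python sketch of that idea. It is not part of the original answer: the function name build_range_count_query, the users_<low>_<high> naming scheme and the in_range sub-aggregation name are hypothetical choices made for illustration. Only the terms/sum/bucket_selector structure is taken from the query above.

```python
import json


def build_range_count_query(bounds, email_field="email", amount_field="amount", size=1000):
    """Build one terms/sum/bucket_selector aggregation per [low, high) range.

    bounds is a list of (low, high) pairs; None means "unbounded" on that side.
    All names produced here are illustrative, not an Elasticsearch convention.
    """
    aggs = {}
    for low, high in bounds:
        conditions = []
        if low is not None:
            conditions.append(f"params.amountReceived >= {low}")
        if high is not None:
            conditions.append(f"params.amountReceived < {high}")
        if not conditions:
            raise ValueError("each range needs at least one bound")

        low_label = "min" if low is None else low
        high_label = "max" if high is None else high
        aggs[f"users_{low_label}_{high_label}"] = {
            # Same terms/sum pair as in the hand-written query.
            "terms": {"field": email_field, "size": size},
            "aggs": {
                "amount_received": {"sum": {"field": amount_field}},
                # Keeps only the email buckets whose sum falls in this range.
                "in_range": {
                    "bucket_selector": {
                        "buckets_path": {"amountReceived": "amount_received"},
                        "script": " && ".join(conditions),
                    }
                },
            },
        }
    return {"size": 0, "aggs": aggs}


query = build_range_count_query([(None, 100), (100, 300), (300, 600), (600, None)])
print(json.dumps(query, indent=2))
```

The printed JSON can be sent as the body of the same POST index/_search request. Using half-open ranges (lower bound inclusive, upper bound exclusive) mirrors the scripts in the answer and avoids gaps between neighbouring ranges.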
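In the results, the number of buckets in users_99 is the number of unique emails whose total amount is less than 100. Similarly, users_100_299 contains one bucket per unique email whose total amount is at least 100 and less than 300. And so on...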
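Turning that response into the counts the question asks for is then a matter of counting buckets per aggregation. The sketch below assumes the usual search response shape, with an "aggregations" object whose entries each carry a "buckets" list. The helper name count_emails_per_range and the sample response are made up for illustration. One caveat, as far as I understand the execution order: bucket_selector only filters the buckets that the terms aggregation returned, so with "size": 1000 the counts cover at most the 1000 emails returned in each aggregation.

```python
def count_emails_per_range(response):
    """Map each top-level aggregation name to its number of surviving buckets."""
    return {
        name: len(agg.get("buckets", []))
        for name, agg in response.get("aggregations", {}).items()
    }


# Hypothetical, heavily trimmed response used only to illustrate the shape.
sample_response = {
    "aggregations": {
        "users_99": {
            "buckets": [
                {"key": "a@example.com", "doc_count": 2, "amount_received": {"value": 40.0}},
                {"key": "b@example.com", "doc_count": 1, "amount_received": {"value": 99.5}},
            ]
        },
        "users_100_299": {
            "buckets": [
                {"key": "c@example.com", "doc_count": 3, "amount_received": {"value": 150.0}},
            ]
        },
    }
}

print(count_emails_per_range(sample_response))
# -> {'users_99': 2, 'users_100_299': 1}
```

The aggregation names can then be mapped to display keys such as "0 - 99" in application code, as the question already plans to do.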