For example:
I have many documents like this:
email status
1@123.com open
1@123.com click
2@123.com open
3@123.com open
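I want to find all emails whose only status value is "open". Because "1@123.com" also has a record with the "click" status, "1@123.com" should not be counted.
I tried the query below, but it does not return what I expect: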
{
  "aggs": {
    "hard_bounce_count": {
      "filter": {
        "term": {
          "actionStatus": "open"
        }
      },
      "aggs": {
        "email_count": {
          "value_count": {
            "field": "email"
          }
        }
      }
    }
  }
}
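To illustrate why this attempt falls short, here is a minimal sketch in plain Python (not Elasticsearch) that mimics a filter on the status followed by a value count on the email field. The `records` list and the helper name `count_emails_with_status` are assumptions made for this illustration; the rows are the sample data shown above.

```python
# Sample records from the question: (email, status).
records = [
    ("1@123.com", "open"),
    ("1@123.com", "click"),
    ("2@123.com", "open"),
    ("3@123.com", "open"),
]

def count_emails_with_status(rows, status):
    """Mimic a filter on status followed by a value count on email.

    Hypothetical helper: it only looks at one record at a time, so it
    cannot know that the same email also has a different status elsewhere.
    """
    return sum(1 for _email, s in rows if s == status)

open_count = count_emails_with_status(records, "open")
print(open_count)  # 3, because 1@123.com is still counted
```

The filter works per document, so it cannot exclude an email based on its other records. Some per-email grouping is needed first.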
I expect a result like this:
2@123.com open
3@123.com open
How can I do this? Thanks.
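Answer 0 (score: 0)
Here, the outer terms aggregation (named EMAIL_LIST) returns one bucket per email. Within each email bucket, a filter aggregation named OPEN counts the documents whose status is "open", and a second filter aggregation named OTHER_THAN_OPEN counts the documents whose status is anything other than "open".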
{
  "size": 0,
  "aggs": {
    "EMAIL_LIST": {
      "terms": {
        "field": "email.keyword"
      },
      "aggs": {
        "OPEN": {
          "filter": {
            "bool": {
              "must": [
                {
                  "term": {
                    "status": "open"
                  }
                }
              ]
            }
          }
        },
        "OTHER_THAN_OPEN": {
          "filter": {
            "bool": {
              "must_not": [
                {
                  "term": {
                    "status": "open"
                  }
                }
              ]
            }
          }
        },
        "SELECTION_SCRIPT": {
          "bucket_selector": {
            "buckets_path": {
              "open_count": "OPEN._count",
              "other_than_open_count": "OTHER_THAN_OPEN._count"
            },
            "script": "params.other_than_open_count==0 && params.open_count>0"
          }
        }
      }
    }
  }
}
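The "bucket_selector" aggregation above keeps only those buckets in which every document has the "open" status. The response looks like this: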
"aggregations": {
  "EMAIL_LIST": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "2@123.com",
        "doc_count": 1,
        "OTHER_THAN_OPEN": {
          "doc_count": 0
        },
        "OPEN": {
          "doc_count": 1
        }
      },
      {
        "key": "3@123.com",
        "doc_count": 1,
        "OTHER_THAN_OPEN": {
          "doc_count": 0
        },
        "OPEN": {
          "doc_count": 1
        }
      }
    ]
  }
}
So the final answer is the emails "2@123.com" and "3@123.com".
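The selection logic can also be checked outside Elasticsearch. The following is a minimal sketch in plain Python that mirrors the three steps (group by email, count "open" and non-"open" records, keep buckets matching the script condition). The `records` list and the function name `emails_with_only_status` are assumptions for illustration, not part of any Elasticsearch API.

```python
from collections import defaultdict

# Sample records from the question: (email, status).
records = [
    ("1@123.com", "open"),
    ("1@123.com", "click"),
    ("2@123.com", "open"),
    ("3@123.com", "open"),
]

def emails_with_only_status(rows, wanted="open"):
    """Return emails whose records all carry the wanted status."""
    # "terms" step: one bucket per email, with two counters that play
    # the role of the OPEN and OTHER_THAN_OPEN filter aggregations.
    buckets = defaultdict(lambda: {"open_count": 0, "other_than_open_count": 0})
    for email, status in rows:
        key = "open_count" if status == wanted else "other_than_open_count"
        buckets[email][key] += 1

    # "bucket_selector" step: same condition as the script in the query.
    return sorted(
        email
        for email, c in buckets.items()
        if c["other_than_open_count"] == 0 and c["open_count"] > 0
    )

only_open = emails_with_only_status(records)
print(only_open)  # ['2@123.com', '3@123.com']
```

Note that a terms aggregation returns only the top 10 buckets by default, so with many distinct emails the real query would likely need a larger "size" or some form of pagination.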
Answer 1 (score: 0)
I can also query it like this:
{
  "aggs": {
    "email": {
      "terms": {
        "field": "email"
      },
      "aggs": {
        "status_group": {
          "terms": {
            "field": "status"
          }
        }
      }
    }
  }
}
Response:
"aggregations": {
  "email": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "1@123.com",
        "doc_count": 2,
        "status_group": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "click",
              "doc_count": 1
            },
            {
              "key": "open",
              "doc_count": 1
            }
          ]
        }
      },
      {
        "key": "2@123.com",
        "doc_count": 1,
        "status_group": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "open",
              "doc_count": 1
            }
          ]
        }
      },
      {
        "key": "3@123.com",
        "doc_count": 1,
        "status_group": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "open",
              "doc_count": 1
            }
          ]
        }
      }
    ]
  }
}
But how can I exclude "1@123.com" from the result buckets? In the end I need statistics over all documents that meet the condition.
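One possible way to exclude such buckets, sketched here under assumptions, is to post-process the response on the client. The Python below works on a trimmed copy of the response shown above; the variable `aggregations` and the helper `only_open_buckets` are hypothetical names for this illustration, and the approach assumes the response contains every email and every status bucket (terms aggregations return only the top 10 buckets by default).

```python
# Trimmed copy of the "aggregations" part of the response above.
aggregations = {
    "email": {
        "buckets": [
            {"key": "1@123.com", "doc_count": 2,
             "status_group": {"buckets": [{"key": "click", "doc_count": 1},
                                          {"key": "open", "doc_count": 1}]}},
            {"key": "2@123.com", "doc_count": 1,
             "status_group": {"buckets": [{"key": "open", "doc_count": 1}]}},
            {"key": "3@123.com", "doc_count": 1,
             "status_group": {"buckets": [{"key": "open", "doc_count": 1}]}},
        ]
    }
}

def only_open_buckets(aggs, wanted="open"):
    """Keep email buckets whose set of statuses is exactly {wanted}."""
    kept = []
    for bucket in aggs["email"]["buckets"]:
        statuses = {b["key"] for b in bucket["status_group"]["buckets"]}
        if statuses == {wanted}:
            kept.append(bucket)
    return kept

kept = only_open_buckets(aggregations)
emails = [b["key"] for b in kept]
total_docs = sum(b["doc_count"] for b in kept)  # simple statistic over kept buckets
print(emails, total_docs)  # ['2@123.com', '3@123.com'] 2
```

Filtering on the server with a bucket_selector, as in the other answer, avoids transferring the unwanted buckets; the client-side version is mainly useful for small result sets or for checking the logic.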