我有一个包含数百万个文档的索引。假设我的每个文档都有一些代码,我需要找到符合某些条件的代码列表。我发现这样做的唯一方法是使用大量的聚合,所以我创建了一个丑陋的查询,它完全符合我的要求:
POST my-index/_search
{
"query": {
"range": {
"timestamp": {
"gte": "2017-08-01T00:00:00.000",
"lt": "2017-08-08T00:00:00.000"
}
}
},
"size": 0,
"aggs": {
"codes": {
"terms": {
"field": "code",
"size": 10000
},
"aggs": {
"days": {
"date_histogram": {
"field": "timestamp",
"interval": "day",
"format": "dd"
},
"aggs": {
"hours": {
"date_histogram": {
"field": "timestamp",
"interval": "hour",
"format": "yyyy-MM-dd:HH"
},
"aggs": {
"hour_income": {
"sum": {
"field": "price"
}
}
}
},
"max_income": {
"max_bucket": {
"buckets_path": "hours>hour_income"
}
},
"day_income": {
"sum_bucket": {
"buckets_path": "hours.hour_income"
}
},
"more_than_sixty_percent": {
"bucket_script": {
"buckets_path": {
"dayIncome": "day_income",
"maxIncome": "max_income"
},
"script": "params.maxIncome - params.dayIncome * 60 / 100 > 0 ? 1 : 0"
}
}
}
},
"amount_of_days": {
"sum_bucket": {
"buckets_path": "days.more_than_sixty_percent"
}
},
"bucket_filter": {
"bucket_selector": {
"buckets_path": {
"amountOfDays": "amount_of_days"
},
"script": "params.amountOfDays >= 3"
}
}
}
}
}
}
我得到的响应是几百万行JSON,由桶组成。每个桶有超过700行(和它自己的桶),但我需要的只是它的键,所以我有我的代码列表。我觉得它的响应比必要的大几千倍并不好,解析时可能会出现问题。所以我想问一下,有没有办法隐藏存储桶中的其他信息并只获取密钥?
感谢。