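I have a series of ranges, say [1-5], [6-10], [11-15]. For each range, I want the documents whose "start" field is at or before the first element of the range and whose "end" field is at or after the second element, i.e. documents that span the whole range.
With Elasticsearch, I can do this for a single range (here 1-5) with the following query: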
GET my_index/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "end": {
                  "gte": 5
                }
              }
            },
            {
              "range": {
                "start": {
                  "lte": 1
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "value": {
      "terms": {
        "field": "value",
        "size": 100
      }
    }
  }
}
How can I do this for multiple ranges in a single query?
In pseudocode, it would be something like:
for each range:
    must:
        start < range[0],
        end > range[1]
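One way to express "for each range" in a single request is a `filters` aggregation with one named filter per range and the `terms` aggregation nested under it. The following is a minimal Python sketch that only builds the request body; it is not taken from the question or the answer. The helper name `build_request` and the bucket names such as "1-5" are assumptions made for illustration, and the bounds follow the inclusive `lte`/`gte` of the single-range query rather than the strict comparisons in the pseudocode.

```python
import json


def build_request(ranges, value_field="value", size=100):
    """Build a search body with one named filter bucket per (low, high) range.

    Hypothetical helper: the name and the "low-high" bucket labels are
    illustrative assumptions, not part of the original question.
    """
    named_filters = {}
    for low, high in ranges:
        # Same containment test as the single-range query:
        # start <= low and end >= high.
        named_filters[f"{low}-{high}"] = {
            "bool": {
                "must": [
                    {"range": {"start": {"lte": low}}},
                    {"range": {"end": {"gte": high}}},
                ]
            }
        }
    return {
        "size": 0,
        "aggs": {
            "my_ranges": {
                # One bucket per named filter.
                "filters": {"filters": named_filters},
                # The terms aggregation runs inside every bucket.
                "aggs": {
                    "value": {"terms": {"field": value_field, "size": size}}
                },
            }
        },
    }


body = build_request([(1, 5), (6, 10), (11, 15)])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of `GET my_index/_search`. With named filters the response returns `buckets` as an object keyed by the filter name rather than as an array. If the strict bounds from the pseudocode are wanted, `lte`/`gte` can be swapped for `lt`/`gt`.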
Test data:
PUT /test_index
{"settings": {"number_of_shards": 1}}
POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"start":0, "end":10, "value":2}
{"index":{"_id":2}}
{"start":0, "end":12, "value":1}
{"index":{"_id":3}}
{"start":2, "end":11, "value":2}
{"index":{"_id":4}}
{"start":11, "end":13, "value":3}
Expected output:
{
  ...
  "aggregations": {
    "my_ranges": {
      "buckets": [
        {
          "key": "1-5", # i.e. range 1-5
          "doc_count": 2,
          "value": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "2",
                "doc_count": 1
              },
              {
                "key": "1",
                "doc_count": 1
              }
            ]
          }
        },
        {
          "key": "6-10", # i.e. range 6-10
          "doc_count": 1,
          "value": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "2",
                "doc_count": 1
              }
            ]
          }
        }
      ]
    }
  }
}
Answer 0: (score: 0)
If I understand your requirement correctly, and assuming your range field is an integer, a Histogram aggregation should do it.
Try this:
{
  "aggs": {
    "ranges": {
      "histogram": {
        "field": "range",
        "interval": 5
      },
      "aggs": {
        "values": {
          "terms": {
            "field": "value",
            "size": 100
          }
        }
      }
    }
  },
  "size": 0
}
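To see which buckets a histogram with `"interval": 5` would produce, here is a minimal Python sketch of the bucket-key rule given in the Elasticsearch documentation, `bucket_key = floor((value - offset) / interval) * interval + offset`. The helper name `histogram_key` is an assumption made for illustration. Note also that a histogram buckets on the value of one numeric field, and the `range` field used above does not appear in the question's test data, which only has `start`, `end`, and `value`.

```python
import math


def histogram_key(value, interval, offset=0):
    """Bucket key that a fixed-interval histogram assigns to a numeric value.

    Hypothetical helper mirroring the documented rounding rule; it does not
    call Elasticsearch.
    """
    return math.floor((value - offset) / interval) * interval + offset


# Range endpoints from the question, bucketed with interval 5 and no offset.
for v in (1, 5, 6, 10, 11, 15):
    print(v, "->", histogram_key(v, 5))
```

With interval 5 and no offset the buckets start at 0, 5, 10, and so on, so the endpoints 1 and 5 fall into different buckets and the buckets do not line up with 1-5, 6-10, 11-15. Whether an `offset` setting is available to shift the bucket boundaries depends on the Elasticsearch version, so treat the `offset` argument here as an assumption.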