Elasticsearch: range query on two fields

Time: 2015-12-01 18:25:11

Tags: elasticsearch range aggregation

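I have a series of ranges, say [1-5], [6-10], [11-15]. For each range, I am looking for the documents whose "start" field is before the first element of the range and whose "end" field is after the second element of the range.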

With Elasticsearch, I can do this for a single range (here [1-5]) with the following query:

GET my_index/_search
    {
      "size": 0,
      "query": {
        "filtered": {
          "filter": {
            "bool": {
              "must": [
                {
                  "range": {
                    "end": {
                      "gte": 5
                    }
                  }
                },
                {
                  "range": {
                    "start": {
                      "lte": 1
                    }
                  }
                }
              ]
            }
          }
        }
      },
      "aggs": {
        "value": {
          "terms": {
            "field": "value",
            "size": 100
          }
        }
      }
    }

How can I do this for multiple ranges in a single query?

In pseudocode, it would be something like:

 for each range: 
   must:
      start < range[0],
      end > range[1]
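
The pseudocode above maps fairly directly onto one named filter per range. One way this might be expressed is a `filters` aggregation with the `terms` aggregation nested inside it. The following is only a minimal sketch that builds such a request body in Python: the helper name `build_ranges_request` is hypothetical, the aggregation names `my_ranges` and `value` are borrowed from the question, and the comparison is assumed to be inclusive (`lte`/`gte`) as in the single-range query.

```python
import json


def build_ranges_request(ranges, value_field="value", size=100):
    """Build a search body with one named filter per (low, high) range.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    named_filters = {}
    for low, high in ranges:
        # A document matches when it spans the whole range:
        # start <= low AND end >= high (inclusive, like the lte/gte query).
        named_filters[f"{low}-{high}"] = {
            "bool": {
                "must": [
                    {"range": {"start": {"lte": low}}},
                    {"range": {"end": {"gte": high}}},
                ]
            }
        }
    return {
        "size": 0,
        "aggs": {
            "my_ranges": {
                # One bucket per named filter, keyed by the "low-high" label.
                "filters": {"filters": named_filters},
                "aggs": {
                    "value": {"terms": {"field": value_field, "size": size}}
                },
            }
        },
    }


body = build_ranges_request([(1, 5), (6, 10), (11, 15)])
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of `GET my_index/_search`. Each range should come back as its own bucket under `my_ranges`, with the `value` terms nested inside it.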

Test data:

PUT /test_index
{"settings": {"number_of_shards": 1}}

POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"start":0, "end":10, "value":2}
{"index":{"_id":2}}
{"start":0, "end":12, "value":1}
{"index":{"_id":3}}
{"start":2, "end":11, "value":2}
{"index":{"_id":4}}
{"start":11, "end":13, "value":3}

Expected output:

{
   ...
   "aggregations": {
      "my_ranges": {
         "buckets": [
            {
               "key": "1-5", # i.e. range 1-5
               "doc_count": 2,
               "value": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "2",
                        "doc_count": 1
                     },
                     {
                        "key": "1",
                        "doc_count": 1
                     }
                  ]
               }
            },
            {
               "key": "6-10", # i.e. range 6-10
               "doc_count": 1,
               "value": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "2",
                        "doc_count": 1
                     }
                  ]
               }
            }
         ]
      }
   }
}

1 answer:

Answer 0 (score: 0)

If I understand your requirement correctly, and assuming your range field is an integer, a Histogram aggregation should do the job. Try this:

    {
      "aggs": {
        "ranges": {
          "histogram": {
            "field": "range",
            "interval": 5
          },
          "aggs": {
            "values": {
              "terms": {
                "field": "value",
                "size": 100
              }
            }
          }
        }
      },
      "size": 0
    }
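
For context on what this would produce: a histogram aggregation groups the values of a single numeric field into fixed-width buckets, with each bucket key computed as `floor((value - offset) / interval) * interval + offset`. The sketch below reproduces that arithmetic in Python so the resulting bucket boundaries can be inspected. The function name `histogram_bucket_key` is hypothetical, and the `range` field used in the answer is itself an assumption, since the test data only has `start`, `end` and `value`.

```python
import math


def histogram_bucket_key(value, interval, offset=0):
    """Return the key of the fixed-width bucket that `value` falls into."""
    return math.floor((value - offset) / interval) * interval + offset


for v in (1, 5, 6, 10, 11, 15):
    plain = histogram_bucket_key(v, 5)
    shifted = histogram_bucket_key(v, 5, offset=1)
    print(f"value={v}: key without offset={plain}, key with offset 1={shifted}")
```

With `interval` 5 and no offset, the buckets start at 0, 5 and 10, so they do not line up with [1-5], [6-10], [11-15]; an offset of 1 would align them. Either way, the histogram buckets on one field only, so it does not by itself express the two-field condition on `start` and `end`.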