Summing a set of values under a condition in Elasticsearch

Time: 2015-11-30 16:44:05

Tags: elasticsearch

Given the following Elasticsearch documents, how would I build a search that sums the values of the seconds field for a given datetime range?

My current query is shown further below, after the sample documents.

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "searchdb",
        "_type": "profile",
        "_id": "1825",
        "_score": 1,
        "_source": {
          "id": 1825,
          "market": "Chicago",
          "geo_location": {
            "lat": 41.1234,
            "lon": -87.5678
          },
          "hourly_values": [
            {
              "datetime": "1997-07-16T19:00:00.00+00:00",
              "seconds": 1200
            },
            {
              "datetime": "1997-07-16T19:20:00.00+00:00",
              "seconds": 1200
            },
            {
              "datetime": "1997-07-16T19:20:00.00+00:00",
              "seconds": 1200
            }
          ]
        }
      },
      {
        "_index": "searchdb",
        "_type": "profile",
        "_id": "1808",
        "_score": 1,
        "_source": {
          "id": 1808,
          "market": "Chicago",
          "geo_location": {
            "lat": 41.1234,
            "lon": -87.5678
          },
          "hourly_values": [
            {
              "datetime": "1997-07-16T19:00:00.00+00:00",
              "seconds": 900
            },
            {
              "datetime": "1997-07-16T19:20:00.00+00:00",
              "seconds": 1200
            },
            {
              "datetime": "1997-07-16T19:20:00.00+00:00",
              "seconds": 800
            }
          ]
        }
      }
    ]
  }
}

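Here is my current query. Its problem is that it does not take the datetime field into account. I need to sum only the seconds values that fall within a datetime range given in the query.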

{
    "aggs": {
        "Ids": {
            "terms": {
                "field": "id",
                "size": 0
            },
            "aggs": {
                "Nesting": {
                    "nested": {
                        "path": "hourly_values"
                    },
                    "aggs": {
                        "availability": {
                            "sum": {
                                "field": "hourly_values.seconds"
                            }
                        }
                    }
                }
            }
        }
    }
} 

I know you can use a range filter, like this:

"filter" : {
                "range" : { "timestamp" : { "from" : "now/1d+9.5h", "to" : "now/1d+16h" }}
            }

But I cannot figure out how to integrate it into my query to get the output I want.

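To be clear, the output I want is every object returned by the query, together with the sum of its seconds field, but summed only over the values within the given time range.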

1 Answer:

Answer 0: (score: 1)

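I think this can be done with a filter aggregation: put the range filter inside the nested aggregation and hang the sum underneath it, so only the nested hourly_values entries that match the range are summed.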

Try this:

{
  "aggs": {
    "Ids": {
      "terms": {
        "field": "id",
        "size": 0
      },
      "aggs": {
        "Nesting": {
          "nested": {
            "path": "hourly_values"
          },
          "aggs": {
            "filtered_result": {
              "filter": {
                "query": {
                  "range": {
                    "hourly_values.datetime": {
                      "gt": "1997-07-16T19:10:00.00+00:00",
                      "lt": "1997-07-16T19:22:00.00+00:00"
                    }
                  }
                }
              },
              "aggs": {
                "availability": {
                  "sum": {
                    "field": "hourly_values.seconds"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "size": 0
} 
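
As a side note, if the datetime range needs to vary per request, the same body can be generated from code. The following is only a minimal sketch: `build_sum_query` is a hypothetical helper name, it copies the JSON above verbatim (including the exclusive `gt`/`lt` bounds), and it only builds and prints the body without contacting a cluster.

```python
import json


def build_sum_query(start, end):
    """Return the aggregation body shown above for an exclusive (start, end) range.

    Hypothetical helper for illustration; the structure is copied from the JSON query.
    """
    return {
        "size": 0,
        "aggs": {
            "Ids": {
                "terms": {"field": "id", "size": 0},
                "aggs": {
                    "Nesting": {
                        "nested": {"path": "hourly_values"},
                        "aggs": {
                            "filtered_result": {
                                # Keep only nested entries whose datetime is inside the range.
                                "filter": {
                                    "query": {
                                        "range": {
                                            "hourly_values.datetime": {
                                                "gt": start,
                                                "lt": end,
                                            }
                                        }
                                    }
                                },
                                # Sum the seconds of the entries that passed the filter.
                                "aggs": {
                                    "availability": {
                                        "sum": {"field": "hourly_values.seconds"}
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


body = build_sum_query(
    "1997-07-16T19:10:00.00+00:00",
    "1997-07-16T19:22:00.00+00:00",
)
print(json.dumps(body, indent=2))
```

The printed JSON can then be sent as the search request body with whatever client you already use.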

Running the JSON query against the two sample documents, the result I get is:

"aggregations": {
      "Ids": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "1808",
               "doc_count": 1,
               "Nesting": {
                  "doc_count": 3,
                  "filtered_result": {
                     "doc_count": 2,
                     "availability": {
                        "value": 2000
                     }
                  }
               }
            },
            {
               "key": "1825",
               "doc_count": 1,
               "Nesting": {
                  "doc_count": 3,
                  "filtered_result": {
                     "doc_count": 2,
                     "availability": {
                        "value": 2400
                     }
                  }
               }
            }
         ]
      }
   }
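
To sanity-check those numbers, the same filter-then-sum logic can be reproduced by hand on the two sample documents. This is just a sketch in plain Python, not something Elasticsearch runs; `sum_in_range` and the `docs` layout are names made up for the illustration, and the data is copied from the question.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%f%z"  # matches e.g. 1997-07-16T19:00:00.00+00:00

# Sample data from the question: id -> list of (datetime, seconds) entries.
docs = {
    1825: [
        ("1997-07-16T19:00:00.00+00:00", 1200),
        ("1997-07-16T19:20:00.00+00:00", 1200),
        ("1997-07-16T19:20:00.00+00:00", 1200),
    ],
    1808: [
        ("1997-07-16T19:00:00.00+00:00", 900),
        ("1997-07-16T19:20:00.00+00:00", 1200),
        ("1997-07-16T19:20:00.00+00:00", 800),
    ],
}


def sum_in_range(values, start, end):
    """Sum seconds for entries with start < datetime < end (exclusive, like gt/lt)."""
    lo = datetime.strptime(start, FMT)
    hi = datetime.strptime(end, FMT)
    return sum(
        seconds
        for stamp, seconds in values
        if lo < datetime.strptime(stamp, FMT) < hi
    )


totals = {
    doc_id: sum_in_range(
        values,
        "1997-07-16T19:10:00.00+00:00",
        "1997-07-16T19:22:00.00+00:00",
    )
    for doc_id, values in docs.items()
}
print(totals)  # {1825: 2400, 1808: 2000}
```

Only the two 19:20 entries of each document fall inside the range, which matches the `doc_count` of 2 and the `availability` values of 2400 and 2000 in the response.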

Does this help?