Elasticsearch date histogram aggregation with a specific time range

Time: 2015-11-16 14:14:53

Tags: elasticsearch

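We are running a three-level aggregation over a specific date range: we need the distinct "Website" names, grouped by distinct "HitCount" values, grouped by "DateTime" in fixed intervals. The date histogram aggregation lets us bucket documents by interval, but the "key_as_string" dates are always counted from 12 AM rather than from the start time of the date range given in the query. The day (24 hours starting at 12 AM) is split according to the interval value, and the aggregation output is reported for those buckets.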

For example, we set the From time to "2015-11-10T11:00:00" and the To time to "2015-11-13T11:00:00", with an interval of 8 hours.

The following query is used:

{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "DateTime": {
                  "from": "2015-11-10T11:00:00",
                  "to": "2015-11-13T11:00:00"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "Website": {
      "terms": {
        "field": "Website",
        "size": 0,
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "HitCount": {
          "terms": {
            "field": "HitCount",
            "size": 0,
            "order": {
              "_count": "desc"
            }
          },
          "aggs": {
            "DateTime": {
              "date_histogram": {
                "field": "DateTime",
                "interval": "8h",
                "min_doc_count": 0,
                "extended_bounds": {
                  "min": 1447153200000,
                  "max": 1447412400000
                }
              }
            }
          }
        }
      }
    }
  }
}
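
The "extended_bounds" values in the query above are epoch milliseconds. As a minimal sketch, assuming the range timestamps are meant as UTC, they can be derived from the From and To strings like this (the helper name to_epoch_millis is hypothetical, not part of any Elasticsearch client):

```python
import calendar
from datetime import datetime


def to_epoch_millis(iso_utc):
    """Convert a 'YYYY-MM-DDTHH:MM:SS' string, assumed to be UTC, to epoch milliseconds."""
    dt = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%S")
    # calendar.timegm interprets the time tuple as UTC and returns whole seconds.
    return calendar.timegm(dt.timetuple()) * 1000


bounds = {
    "min": to_epoch_millis("2015-11-10T11:00:00"),
    "max": to_epoch_millis("2015-11-13T11:00:00"),
}
print(bounds)  # {'min': 1447153200000, 'max': 1447412400000}
```

The result matches the "min" and "max" values used in the query, which suggests the bounds themselves are correct and the bucket alignment is the actual issue.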

The query output for the third-level DateTime aggregation is:

"DateTime": {
  "buckets": [
    {
      "key_as_string": "2015-11-10T08:00:00.000Z",
      "key": 1447142400000,
      "doc_count": 62698
    },
    {
      "key_as_string": "2015-11-10T16:00:00.000Z",
      "key": 1447171200000,
      "doc_count": 248118
    },
    {
      "key_as_string": "2015-11-11T00:00:00.000Z",
      "key": 1447200000000,
      "doc_count": 224898
    },
    {
      "key_as_string": "2015-11-11T08:00:00.000Z",
      "key": 1447228800000,
      "doc_count": 221663
    },
    {
      "key_as_string": "2015-11-11T16:00:00.000Z",
      "key": 1447257600000,
      "doc_count": 220935
    },
    {
      "key_as_string": "2015-11-12T00:00:00.000Z",
      "key": 1447286400000,
      "doc_count": 219340
    },
    {
      "key_as_string": "2015-11-12T08:00:00.000Z",
      "key": 1447315200000,
      "doc_count": 218452
    },
    {
      "key_as_string": "2015-11-12T16:00:00.000Z",
      "key": 1447344000000,
      "doc_count": 190
    },
    {
      "key_as_string": "2015-11-13T00:00:00.000Z",
      "key": 1447372800000,
      "doc_count": 0
    },
    {
      "key_as_string": "2015-11-13T08:00:00.000Z",
      "key": 1447401600000,
      "doc_count": 0
    }
  ]
}
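
The keys in this output are consistent with each timestamp being rounded down to a multiple of the interval counted from the Unix epoch, which for 8 hours lands on 00:00, 08:00 and 16:00 UTC. The sketch below is only a model of that observed behaviour, not Elasticsearch's actual implementation; bucket_key and as_iso are hypothetical helper names.

```python
from datetime import datetime, timezone

HOUR_MS = 60 * 60 * 1000


def bucket_key(timestamp_ms, interval_ms):
    """Round a timestamp down to the nearest multiple of the interval since the epoch."""
    return (timestamp_ms // interval_ms) * interval_ms


def as_iso(ms):
    """Format epoch milliseconds in the same style as key_as_string (UTC)."""
    dt = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.000Z")


range_start_ms = 1447153200000  # 2015-11-10T11:00:00Z, the "min" bound from the query
first_key = bucket_key(range_start_ms, 8 * HOUR_MS)
print(first_key, as_iso(first_key))  # 1447142400000 2015-11-10T08:00:00.000Z
```

Under this model, a document stamped 11:00 falls into the bucket that starts at 08:00, which is the first bucket shown in the output.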


Expected Output:

Here, we would expect the intervals to be divided and queried as:
2015-11-10T11:00:00 to 2015-11-10T19:00:00
2015-11-10T19:00:00 to 2015-11-11T03:00:00
2015-11-11T03:00:00 to 2015-11-11T11:00:00
2015-11-11T11:00:00 to 2015-11-11T19:00:00
2015-11-11T19:00:00 to 2015-11-12T03:00:00
2015-11-12T03:00:00 to 2015-11-12T11:00:00
2015-11-12T11:00:00 to 2015-11-12T19:00:00
2015-11-12T19:00:00 to 2015-11-13T03:00:00
2015-11-13T03:00:00 to 2015-11-13T11:00:00


That is, the "key_as_string" output values should be 2015-11-10T11:00:00, 2015-11-10T19:00:00, and so on.
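
For reference, the expected bucket start times can be generated by stepping from the From time to the To time in 8-hour increments. This is a small sketch for checking the list above; expected_bucket_starts is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def expected_bucket_starts(start, end, step):
    """Yield bucket start times from start (inclusive) up to end (exclusive)."""
    current = start
    while current < end:
        yield current
        current += step


start = datetime(2015, 11, 10, 11, 0, 0)
end = datetime(2015, 11, 13, 11, 0, 0)
keys = [
    t.strftime("%Y-%m-%dT%H:%M:%S")
    for t in expected_bucket_starts(start, end, timedelta(hours=8))
]
for key in keys:
    print(key)
```

This prints nine start times, from 2015-11-10T11:00:00 through 2015-11-13T03:00:00, one per expected interval.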

We need this because the From and To times are both 11 AM: each time we fire the query, the buckets should be 8-hour windows counted from the query's start time, rather than fixed windows that split the calendar day.

Note: we are using ES 1.7.

1 answer:

Answer 0 (score: 2)

The documentation says you can use the "offset" parameter of the date_histogram aggregation.

So:

{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "DateTime": {
                  "from": "2015-11-10T11:00:00",
                  "to": "2015-11-13T11:00:00"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "Website": {
      "terms": {
        "field": "Website",
        "size": 0,
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "HitCount": {
          "terms": {
            "field": "HitCount",
            "size": 0,
            "order": {
              "_count": "desc"
            }
          },
          "aggs": {
            "DateTime": {
              "date_histogram": {
                "field": "DateTime",
                "interval": "8h",
                "min_doc_count": 0,
                "offset": "+11h"
              }
            }
          }
        }
      }
    }
  }
}
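
To see why an offset of 11 hours gives buckets that start at 11:00, 19:00 and 03:00, the rounding can be modelled as shifting each timestamp back by the offset, rounding down to the interval, and shifting forward again. This is a sketch of that arithmetic under the assumption that the timestamps are UTC; it is not Elasticsearch's actual code, and bucket_key is a hypothetical helper name.

```python
HOUR_MS = 60 * 60 * 1000


def bucket_key(timestamp_ms, interval_ms, offset_ms=0):
    """Round down to the interval grid after shifting the grid by offset_ms."""
    return ((timestamp_ms - offset_ms) // interval_ms) * interval_ms + offset_ms


interval_ms = 8 * HOUR_MS
range_start_ms = 1447153200000  # 2015-11-10T11:00:00Z

# With "+11h" the range start becomes a bucket boundary itself.
key_with_11h = bucket_key(range_start_ms, interval_ms, 11 * HOUR_MS)

# The shift needed can also be derived from the range start: 11h modulo 8h is 3h.
derived_offset_ms = range_start_ms % interval_ms
key_with_derived = bucket_key(range_start_ms, interval_ms, derived_offset_ms)

print(key_with_11h, key_with_derived, derived_offset_ms // HOUR_MS)
# 1447153200000 1447153200000 3
```

In this model an offset of 11 hours and an offset of 3 hours produce the same boundaries for an 8-hour interval, so the offset for a different From time could be computed as the start time modulo the interval.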