How to add a size to a date_histogram aggregation

Time: 2017-09-26 10:44:22

Tags: elasticsearch

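I am running a query in Elasticsearch. I need the number of hits on my attribute "end_date_ut" (type date, format dateOptionalTime) for each month represented in the index. For this I use a date_histogram aggregation.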

My query is below:

GET inc/_search
{
  "size": 0,
  "aggs": {
    "appli": {
      "date_histogram": {
        "field": "end_date_ut",
        "interval": "month"
      }
    }
  }
}

Here is part of the result:

"hits": {
    "total": 517478,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "appli": {
      "buckets": [
        {
          "key_as_string": "2009-08-01T00:00:00.000Z",
          "key": 1249084800000,
          "doc_count": 0
        },
        {
          "key_as_string": "2009-09-01T00:00:00.000Z",
          "key": 1251763200000,
          "doc_count": 1
        },
        {
          "key_as_string": "2009-10-01T00:00:00.000Z",
          "key": 1254355200000,
          "doc_count": 2362
        },
        {
          "key_as_string": "2009-11-01T00:00:00.000Z",
          "key": 1257033600000,
          "doc_count": 5336
        },
        {
          "key_as_string": "2009-12-01T00:00:00.000Z",
          "key": 1259625600000,
          "doc_count": 7536
        },
        {
          "key_as_string": "2010-01-01T00:00:00.000Z",
          "key": 1262304000000,
          "doc_count": 8864
        }

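The problem is that I get too many buckets (results). With a terms aggregation I have no such problem, because I can set size, but with the date_histogram aggregation I cannot find a way to limit the number of results.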

2 answers:

Answer 0 (score: 1)

{
    "size": 0,
    "aggs": {
        "by_minute": {
            "date_histogram": {
                "field": "createTime",
                "interval": "1m",
                "order": {
                    "_count": "desc"
                }
            },
            "aggs": {
                "top2": {
                    "bucket_sort": {
                        "sort": [],
                        "size": 2
                    }
                }
            }
        }
    }
}

Response:

{
    "took": 28,
    "timed_out": false,
    "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 999999,
        "max_score": 0.0,
        "hits": []
    },
    "aggregations": {
        "by_minute": {
            "buckets": [
                {
                    "key_as_string": "2019-12-21T16:13:00.000Z",
                    "key": 1576944780000,
                    "doc_count": 6374
                },
                {
                    "key_as_string": "2019-12-21T16:10:00.000Z",
                    "key": 1576944600000,
                    "doc_count": 6327
                }
            ]
        }
    }
}
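
Applied to the index in the question, the same approach might look like the sketch below, sent as the body of GET inc/_search. It assumes Elasticsearch 6.1 or later, where the bucket_sort pipeline aggregation is available. The aggregation name "top_months" and the size of 5 are illustrative choices, not taken from the original post. Here the sort is stated explicitly inside bucket_sort instead of relying on the order of the parent histogram.

```json
{
  "size": 0,
  "aggs": {
    "appli": {
      "date_histogram": {
        "field": "end_date_ut",
        "interval": "month"
      },
      "aggs": {
        "top_months": {
          "bucket_sort": {
            "sort": [
              { "_count": { "order": "desc" } }
            ],
            "size": 5
          }
        }
      }
    }
  }
}
```

Note that bucket_sort trims the buckets only after all of them have been computed, so it shortens the response but does not reduce the work done by the aggregation. On Elasticsearch 7.2 and later, "interval" is deprecated in favour of "calendar_interval".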

Answer 1 (score: 0)

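I suggest adding min_doc_count to the date_histogram so that only buckets containing data are included, i.e. buckets with 0 documents are not returned in the response.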

GET inc/_search
{
  "size": 0,
  "aggs": {
    "appli": {
      "date_histogram": {
        "field": "end_date_ut",
        "interval": "month",
        "min_doc_count": 1
      }
    }
  }
}

If you can, you could also add a range query to limit the time span over which the aggregation runs.
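
A range query combined with the aggregation could look roughly like this, sent as the body of GET inc/_search. The date bounds are placeholders chosen for illustration and are not from the original post; adjust them to the period you care about.

```json
{
  "size": 0,
  "query": {
    "range": {
      "end_date_ut": {
        "gte": "2009-09-01",
        "lt": "2010-01-01"
      }
    }
  },
  "aggs": {
    "appli": {
      "date_histogram": {
        "field": "end_date_ut",
        "interval": "month",
        "min_doc_count": 1
      }
    }
  }
}
```

Because only documents matching the range reach the aggregation, this sketch would return at most four monthly buckets (September to December 2009), and unlike a post-hoc truncation it also reduces the number of documents the aggregation has to process.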