Elasticsearch date range aggregation

Time: 2014-08-02 10:23:01

Tags: elasticsearch

I have the following JSON data:

"hits": [
         {
            "_index": "outboxprov1",
            "_type": "deleted-connector",
            "_id": "AHkuN5_iRGO-R5dtaOvz6w",
            "_score": 1,
            "_source": {
               "user_id": "1a9d05586a8dc3f29b4c8147997391f9",
               "deleted_date": "2014-08-02T04:55:04.509Z"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "deleted-connector",
            "_id": "Busk7MDFQ4emtL3x5AQyZA",
            "_score": 1,
            "_source": {
               "user_id": "1a9d05586a8dc3f29b4c8147997391f9",
               "deleted_date": "2014-08-02T04:58:31.440Z"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "deleted-connector",
            "_id": "4AN0zKe9SaSF1trz1IixfA",
            "_score": 1,
            "_source": {
               "user_id": "1a9d05586a8dc3f29b4c8147997391f9",
               "deleted_date": "2014-07-02T04:53:07.010Z"
            }
         }
]

I am trying to write an aggregation query that finds the records whose "deleted_date" falls within a specific range. This is my query:

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "daily_team": {
      "date_range": {
        "field": "deleted_date",
         "format": "YYYY-MM-DD",
        "ranges": [
          {
            "from": "2014-08-02"
          },
          {
            "to": "2014-08-02"
          }
        ]
      },
      "aggs": {
        "daily_team_count": {
          "terms": {
            "field": "user_id"
          }
        }
      }
    }
  }
}

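My problem is that I do not get the correct number of records for a specific date range. Whatever date I put in, I get some doc_count number. I am new to Elasticsearch and I am not sure whether this is the right way to write a range aggregation query. Please help me solve this.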

2 answers:

Answer 0: (score: 3)

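I think the problem is that you are confusing the "from" and "to" of the date range aggregation with those of the range filter. The range filter includes both dates (from and to) by default. In the date_range aggregation, however, the from value is included and the to value is excluded for each range.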

In your query,

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "daily_team": {
      "date_range": {
        "field": "deleted_date",
        "format": "yyyy-MM-dd",   // lowercase letters; in the pattern "DD" means day of year
        "ranges": [
          {
            "from": "2014-08-02"
          },
          {
            "to": "2014-08-03"    // was "2014-08-02"; "to" is excluded, so go one day later to include 2014-08-02
          }
        ]
      },
      "aggs": {
        "daily_team_count": {
          "terms": {
            "field": "user_id"
          }
        }
      }
    }
  }
}

This is what I ran into as well, and I think your problem is the same.

For reference, see the link.
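To make the inclusive "from" / exclusive "to" rule concrete, here is a minimal Python sketch that mimics the bucketing on the three deleted_date values from the question. It does not talk to Elasticsearch; `parse`, `day`, `in_range` and `count` are hypothetical helpers, and the half-open comparison is an assumption taken from the rule described in this answer.

```python
from datetime import datetime, timezone

def parse(ts):
    # Parse an ISO-8601 UTC timestamp such as "2014-08-02T04:55:04.509Z".
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)

def day(d):
    # Midnight UTC for a plain date string such as "2014-08-02".
    return datetime.strptime(d, "%Y-%m-%d").replace(tzinfo=timezone.utc)

def in_range(value, start=None, end=None):
    # Assumed rule: "from" (start) is inclusive, "to" (end) is exclusive.
    return (start is None or value >= start) and (end is None or value < end)

# The three deleted_date values from the question.
deleted_dates = [parse(t) for t in (
    "2014-08-02T04:55:04.509Z",
    "2014-08-02T04:58:31.440Z",
    "2014-07-02T04:53:07.010Z",
)]

def count(start=None, end=None):
    # Number of documents that would land in the bucket [start, end).
    return sum(in_range(d, start, end) for d in deleted_dates)

# "to": "2014-08-02" stops at midnight, so both 08-02 documents are left out.
up_to_aug_2 = count(end=day("2014-08-02"))
# "to": "2014-08-03" covers the whole of 08-02.
up_to_aug_3 = count(end=day("2014-08-03"))
# One range with both bounds selects just that day.
only_aug_2 = count(start=day("2014-08-02"), end=day("2014-08-03"))
print(up_to_aug_2, up_to_aug_3, only_aug_2)
```

Moving the upper bound one day later is what brings the two 2014-08-02 documents into the bucket.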

Answer 1: (score: 1)

What the OP is looking for is an InternalDateRange query. Try this:

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "daily_team": {
      "date_range": {
        "field": "deleted_date",
        "format": "yyyy-MM-dd",         // lowercase letters; in the pattern "DD" means day of year
        "ranges": [
          {
            "from": "2014-08-02||/d",   // /d rounds off to day
                                        // from value -> 2014-08-02T00:00:00.000Z
            "to": "2014-08-03||/d"      // to value -> 2014-08-03T00:00:00.000Z
          }
        ]
      },
      "aggs": {
        "daily_team_count": {
          "terms": {
            "field": "user_id"
          }
        }
      }
    }
  }
}

This returns the count of matching results in a single bucket under the aggregation named daily_team.

"buckets": [
           {
              "key": "2014-08-02T00:00:00.000Z-2014-08-03T00:00:00.000Z",
              "from": 1470096000000,    //test data value
              "from_as_string": "2014-08-02T00:00:00.000Z",
              "to": 1470182400000,      //test data value
              "to_as_string": "2014-08-03T00:00:00.000Z",
              "doc_count": 0
           }
        ]

This returns a single bucket containing the matching doc_count.
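The `||/d` suffix in the query is date math: the part before `||` is the anchor date and `/d` rounds it to the day. As a rough illustration only (plain Python, not Elasticsearch code; `round_down_to_day` and `to_epoch_millis` are hypothetical helpers, and the round-down behaviour is an assumption based on the comments in the query), the rounding and the epoch-millisecond form used in the bucket bounds could be sketched like this:

```python
from datetime import datetime, timezone

def round_down_to_day(ts):
    # Hypothetical helper: drop the time-of-day part, keeping the UTC date.
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

def to_epoch_millis(ts):
    # Bucket bounds such as "from" and "to" are reported as epoch milliseconds.
    return int(ts.timestamp() * 1000)

# First deleted_date value from the question.
sample = datetime(2014, 8, 2, 4, 55, 4, 509000, tzinfo=timezone.utc)
start = round_down_to_day(sample)
print(start.isoformat(), to_epoch_millis(start))
```

Any timestamp on 2014-08-02 rounds to the same midnight value, which is why a bound written with `/d` lines up with the start of the day.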

"ranges": [
      {
        "from": "2014-08-02"
      },
      {
        "to": "2014-08-02"                
      }
]

Using the ranges above returns 2 buckets, one each for the from and to date ranges.

from -> 2014-08-02-* and to -> *-2014-08-02, as shown on the official documentation page.
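This also explains why the original query reports some doc_count for any date. A small Python sketch of the two open-ended buckets, using the three documents from the question, shows that together they always cover every document. It is an illustration only: `two_buckets` is a hypothetical helper, and the bucket keys simply follow the `from-*` / `*-to` pattern quoted above.

```python
from datetime import datetime, timezone

def day(d):
    # Midnight UTC for a plain date string such as "2014-08-02".
    return datetime.strptime(d, "%Y-%m-%d").replace(tzinfo=timezone.utc)

# The three deleted_date values from the question.
deleted_dates = [
    datetime(2014, 8, 2, 4, 55, 4, 509000, tzinfo=timezone.utc),
    datetime(2014, 8, 2, 4, 58, 31, 440000, tzinfo=timezone.utc),
    datetime(2014, 7, 2, 4, 53, 7, 10000, tzinfo=timezone.utc),
]

def two_buckets(boundary):
    # Mimics "ranges": [{"from": boundary}, {"to": boundary}]:
    # one bucket from the boundary onwards, one bucket strictly before it.
    b = day(boundary)
    return {
        boundary + "-*": sum(d >= b for d in deleted_dates),
        "*-" + boundary: sum(d < b for d in deleted_dates),
    }

for boundary in ("2014-08-02", "2014-01-01", "2015-01-01"):
    print(boundary, two_buckets(boundary))
```

Whatever boundary is chosen, the two counts add up to the total number of documents, so putting "from" and "to" into one range object is what actually narrows the result.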