Elasticsearch query aggs: sort by max date

Time: 2014-06-13 21:37:06

Tags: filter elasticsearch grouping aggregation

I have data like this:

Id GroupId UpdateDate
1 1 2013-11-15T12:00:00
2 1 2013-11-20T12:00:00
3 2 2013-12-01T12:00:00
4 2 2013-13-01T12:00:00
5 2 2013-11-01T12:00:00
6 3 2013-10-01T12:00:00

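How can I write a query that returns a filtered/grouped list containing the row with the maximum UpdateDate for each group? The final list should also be sorted by UpdateDate.

I expect this output: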

Id GroupId UpdateDate
4 2 2013-13-01T12:00:00
2 1 2013-11-20T12:00:00
6 3 2013-10-01T12:00:00

Thank you :)

1 answer:

Answer 0: (score: 1)

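Yes, this is possible with Elasticsearch, but the result comes back as JSON and needs to be flattened into the format you show above. Here is how I did it using Marvel Sense.

Bulk load the data: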

POST myindex/mytype/_bulk
{"index":{}}
{"id":1,"GroupId":1,"UpdateDate":"2013-11-15T12:00:00"}
{"index":{}}
{"id":2,"GroupId":1,"UpdateDate":"2013-11-20T12:00:00"}
{"index":{}}
{"id":3,"GroupId":2,"UpdateDate":"2013-12-01T12:00:00"}
{"index":{}}
{"id":4,"GroupId":2,"UpdateDate":"2013-12-01T12:00:00"}
{"index":{}}
{"id":5,"GroupId":2,"UpdateDate":"2013-11-01T12:00:00"}
{"index":{}}
{"id":6,"GroupId":3,"UpdateDate":"2013-10-01T12:00:00"}

Get the max by group:

GET myindex/mytype/_search?search_type=count
{
  "aggs": {
    "NAME": {
      "terms": {
        "field": "GroupId"
      },
      "aggs": {
        "NAME": {
          "max": {
            "field": "UpdateDate"
          }
        }
      }
    }
  }
}

Output:

{
...
   "aggregations": {
      "NAME": {
         "buckets": [
            {
               "key": 2,
               "doc_count": 3,
               "NAME": {
                  "value": 1385899200000
               }
            },
            {
               "key": 1,
               "doc_count": 2,
               "NAME": {
                  "value": 1384948800000
               }
            },
            {
               "key": 3,
               "doc_count": 1,
               "NAME": {
                  "value": 1380628800000
               }
            }
         ]
      }
   }
...
}

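The max date is returned as Unix epoch time (milliseconds since 1970-01-01 UTC) and needs to be converted back to a readable date format.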
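
The flattening and date conversion mentioned above could be done on the client side. The following is only a minimal Python sketch, not part of the original answer: the `response` dict is the bucket data copied from the output above, and the helper names `epoch_millis_to_iso` and `flatten_buckets` are hypothetical. It assumes the dates were indexed without a time zone and are therefore interpreted as UTC.

```python
from datetime import datetime, timezone

# Buckets copied from the "aggregations" section of the output above.
# Both aggregation levels are named "NAME", as in the query.
response = {
    "aggregations": {
        "NAME": {
            "buckets": [
                {"key": 2, "doc_count": 3, "NAME": {"value": 1385899200000}},
                {"key": 1, "doc_count": 2, "NAME": {"value": 1384948800000}},
                {"key": 3, "doc_count": 1, "NAME": {"value": 1380628800000}},
            ]
        }
    }
}


def epoch_millis_to_iso(millis):
    """Convert milliseconds since the Unix epoch to a UTC date string."""
    moment = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S")


def flatten_buckets(resp, agg_name="NAME"):
    """Return (GroupId, UpdateDate) rows, newest max date first."""
    buckets = resp["aggregations"][agg_name]["buckets"]
    # Sort on the raw numeric value so the order does not depend on formatting.
    ordered = sorted(buckets, key=lambda b: b[agg_name]["value"], reverse=True)
    return [(b["key"], epoch_millis_to_iso(b[agg_name]["value"])) for b in ordered]


rows = flatten_buckets(response)
for group_id, update_date in rows:
    print(group_id, update_date)
```

With the buckets shown above this prints one line per group, starting with group 2 at 2013-12-01T12:00:00. Sorting in the client is one option; the terms aggregation also accepts an `order` setting keyed by the name of a single-value sub-aggregation, which may let Elasticsearch return the buckets already sorted by max date.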