Elasticsearch: first and last value within a date range, along with other aggregations

Date: 2018-03-16 16:43:44

Tags: elasticsearch

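I have data in an Elasticsearch index like the input below, followed by the output I expect. The data should be grouped by sku_id. For each group I need the average of rank over the entire date range, plus the first value and the last value of last_7days_avg_rank within that date range, returned as two separate fields as shown below.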

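Can someone tell me whether this is possible in Elasticsearch? Right now I do this computation in the service layer, but the response time has become unacceptable, so I want to move the logic into ES itself. I can't figure out how to achieve this.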

Input:

 date     sku_id last_7days_avg_rank rank 
 20180101  S1      200                200
 20180102  S1      210                200
 20180105  S1      220                200
 20180108  S1      230                200

 20180101  S2      180                300
 20180103  S2      200                300
 20180106  S2      250                300
 20180107  S2      300                300

Expected output:

sku  first_val_last7day_avg  last_val_last7days_avg  avg(rank)   
S1    200                       230                  200
S2    180                       300                  300

Thanks!

1 Answer:

Answer 0: (score: 5)

You can get the desired result with aggregations:

{

   "size": 0,
   "aggs": {
      "GROUP": {
         "terms": {
            "field": "sku_id"
         },
         "aggs": {
            "AVG_RANK": {
               "avg": {
                  "field": "rank"
               }
            },
            "FIRST_7_RANK": {
               "top_hits": {
                  "size": 1,
                  "sort": [
                     {
                        "my_date": {
                           "order": "asc"
                        }
                     }
                  ]
               }
            },
            "LAST_7_RANK": {
               "top_hits": {
                  "size": 1,
                  "sort": [
                     {
                        "my_date": {
                           "order": "desc"
                        }
                     }
                  ]
               }
            }
         }
      }
   }
}
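
The query above does not restrict the date range by itself. As a minimal sketch, the same request body can be built in Python with a range filter added. The helper name build_query, its parameters, and the date strings are illustrative assumptions, not part of the original answer. The date field defaults to my_date as in the answer (the question's table calls the column date), and the date format must match the field's mapping.

```python
import json


def build_query(start, end, date_field="my_date", max_skus=10):
    """Build the search body: keep documents in [start, end], group by sku_id."""

    def edge_hit(order):
        # One document per bucket: the earliest ("asc") or latest ("desc") by date.
        # "_source" trims the returned hit to the fields that are actually needed.
        return {
            "top_hits": {
                "size": 1,
                "sort": [{date_field: {"order": order}}],
                "_source": [date_field, "last_7days_avg_rank"],
            }
        }

    return {
        "size": 0,  # no top-level hits, only aggregations
        "query": {"range": {date_field: {"gte": start, "lte": end}}},
        "aggs": {
            "GROUP": {
                # terms returns only 10 buckets by default; raise "size" for more SKUs
                "terms": {"field": "sku_id", "size": max_skus},
                "aggs": {
                    "AVG_RANK": {"avg": {"field": "rank"}},
                    "FIRST_7_RANK": edge_hit("asc"),
                    "LAST_7_RANK": edge_hit("desc"),
                },
            }
        },
    }


body = build_query("2018-01-01", "2018-01-08")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. Because the range filter sits in the top-level query, the average and both top_hits aggregations only see documents inside the requested range.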

You get the following result as output:

 "aggregations": {
      "GROUP": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "S1",
               "doc_count": 40,
               "LAST_7_RANK": {
                  "hits": {
                     "total": 40,
                     "max_score": null,
                     "hits": [
                        {
                           "_index": "index_name",
                           "_type": "type_name",
                           "_id": "AWI9MU6JeKRzn3ttxGOr",
                           "_score": null,
                           "_source": {
                              "my_date": "2018-01-08",
                              "sku_id": "S1",
                              "last_7days_avg_rank": 230,
                              "rank": 200
                           },
                           "sort": [
                              1515369600000
                           ]
                        }
                     ]
                  }
               },
               "AVG_RANK": {
                  "value": 200
               },
               "FIRST_7_RANK": {
                  "hits": {
                     "total": 40,
                     "max_score": null,
                     "hits": [
                        {
                           "_index": "index_name",
                           "_type": "type_name",
                           "_id": "AWI9LYVpeKRzn3ttxGOQ",
                           "_score": null,
                           "_source": {
                              "my_date": "20180101",
                              "sku_id": "S1",
                              "last_7days_avg_rank": 200,
                              "rank": 200
                           },
                           "sort": [
                              20180101
                           ]
                        }
                     ]
                  }
               }
            },
            {
               "key": "S2",
               "doc_count": 40,
               "LAST_7_RANK": {
                  "hits": {
                     "total": 40,
                     "max_score": null,
                     "hits": [
                        {
                           "_index": "index_name",
                           "_type": "type_name",
                           "_id": "AWI9MU6JeKRzn3ttxGOv",
                           "_score": null,
                           "_source": {
                              "my_date": "2018-01-07",
                              "sku_id": "S2",
                              "last_7days_avg_rank": 300,
                              "rank": 300
                           },
                           "sort": [
                              1515283200000
                           ]
                        }
                     ]
                  }
               },
               "AVG_RANK": {
                  "value": 300
               },
               "FIRST_7_RANK": {
                  "hits": {
                     "total": 40,
                     "max_score": null,
                     "hits": [
                        {
                           "_index": "index_name",
                           "_type": "type_name",
                           "_id": "AWI9LYVpeKRzn3ttxGOU",
                           "_score": null,
                           "_source": {
                              "my_date": "20180101",
                              "sku_id": "S2",
                              "last_7days_avg_rank": 180,
                              "rank": 300
                           },
                           "sort": [
                              20180101
                           ]
                        }
                     ]
                  }
               }
            }
         ]
      }
   }

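The result above contains two buckets (groups), one for S1 and one for S2. In each bucket, AVG_RANK holds the average rank for that group. For first_val_last7day_avg, follow "FIRST_7_RANK" -> "hits" -> "hits" -> "_source" -> "last_7days_avg_rank". Similarly, for last_val_last7days_avg, follow "LAST_7_RANK" -> "hits" -> "hits" -> "_source" -> "last_7days_avg_rank". I hope this helps.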
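
The path-following described above can be sketched in Python. This assumes the response is already available as a dict (for example, parsed from the JSON body of the search response). The function name flatten_buckets and the trimmed sample below are illustrative, with values copied from the output shown earlier.

```python
def flatten_buckets(response):
    """Turn the GROUP buckets into one flat row per sku_id."""
    rows = []
    for bucket in response["aggregations"]["GROUP"]["buckets"]:
        # Each top_hits aggregation was requested with size 1, so take hit 0.
        first = bucket["FIRST_7_RANK"]["hits"]["hits"][0]["_source"]
        last = bucket["LAST_7_RANK"]["hits"]["hits"][0]["_source"]
        rows.append({
            "sku": bucket["key"],
            "first_val_last7day_avg": first["last_7days_avg_rank"],
            "last_val_last7days_avg": last["last_7days_avg_rank"],
            "avg_rank": bucket["AVG_RANK"]["value"],
        })
    return rows


def _hit(value):
    # Helper for the sample only: a top_hits result holding a single document.
    return {"hits": {"hits": [{"_source": {"last_7days_avg_rank": value}}]}}


# Trimmed sample with the same shape and values as the response above.
sample = {
    "aggregations": {
        "GROUP": {
            "buckets": [
                {"key": "S1", "AVG_RANK": {"value": 200},
                 "FIRST_7_RANK": _hit(200), "LAST_7_RANK": _hit(230)},
                {"key": "S2", "AVG_RANK": {"value": 300},
                 "FIRST_7_RANK": _hit(180), "LAST_7_RANK": _hit(300)},
            ]
        }
    }
}

rows = flatten_buckets(sample)
for row in rows:
    print(row)
```

For the sample, the rows match the expected output in the question: S1 gives 200, 230, 200 and S2 gives 180, 300, 300.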