JSON解析-使用JQ按日期范围分组

时间:2019-01-15 13:36:18

标签: json date grouping aggregate jq

我有以下JSON数据:

{
  "data": [  
    {"date": "2018-08-22","value": 1},
    {"date": "2018-08-30","value": 2},
    {"date": "2018-09-01","value": 3},
    {"date": "2018-09-22","value": 3},
    {"date": "2018-09-28","value": 4},
    {"date": "2018-10-18","value": 5},
    {"date": "2018-10-23","value": 6}
  ]
}

我想使用JQ从第一个值开始每月对数据进行分组: 我的月份是:

  • 2018 08 22至2018 09 21

  • 2018 09 22至2018 10 21

  • 2018 10 22至2018 11 21

我的预期输出如下:

{
  "data": [  
    {"month": "2018-08-22 to 2018-09-21","sum": 6},
    {"month": "2018-09-22 to 2018-10-21","sum": 12},
    {"month": "2018-10-23 to 2018-11-21","sum": 6}
  ]
}

我该怎么做?

1 个答案:

答案 0 :(得分:1)

A solution is easily obtained with the following helper function:

# $day should be the day (an integer) that defines the cutoff point;
# input should be a Y-M-D string
def bucket($day):
  def prev: if .m > 1 then .m -= 1 else .m = 12 | .y -= 1 end;
  def fmt: if .m <= 9 then "\(.y)-0\(.m)" else "\(.y)-\(.m)" end;
  capture("(?<y>[0-9]+)-(?<m>[0-9]+)-(?<d>[0-9]+)")
  | map_values(tonumber)
  | if .d >= $day then . else prev end
  | fmt ;

We now compute the bucket-count pairs:

.data
| (.[0].date | capture("(?<d>[0-9]+)$") | .d | tonumber) as $day
| reduce .[] as $pair ({};
  .[$pair.date | bucket($day)] += ($pair.value) )

With the sample data, this produces:

{
  "2018-08": 6,
  "2018-09": 12,
  "2018-10": 6
}      

It is now a trivial matter to convert this into the desired format, and is thus left as an exercise.