I have the following array of objects (this is only an excerpt; the real objects are larger):
[{
"DATE": "10.10.2017 01:00",
"ID": "X",
"VALUE_ONE": 20,
"VALUE_TWO": 5
},
{
"DATE": "10.10.2017 02:00",
"ID": "X",
"VALUE_ONE": 30,
"VALUE_TWO": 7
},
{
"DATE": "10.10.2017 03:00",
"ID": "X",
"VALUE_ONE": 25,
"VALUE_TWO": 2
},
{
"DATE": "10.10.2017 01:00",
"ID": "Y",
"VALUE_ONE": 10,
"VALUE_TWO": 9
},
{
"DATE": "10.10.2017 02:00",
"ID": "Y",
"VALUE_ONE": 20,
"VALUE_TWO": 5
},
{
"DATE": "10.10.2017 03:00",
"ID": "Y",
"VALUE_ONE": 50,
"VALUE_TWO": 5
},
{
"DATE": "10.10.2017 01:00",
"ID": "Z",
"VALUE_ONE": 55,
"VALUE_TWO": 3
},
{
"DATE": "10.10.2017 02:00",
"ID": "Z",
"VALUE_ONE": 60,
"VALUE_TWO": 7
},
{
"DATE": "10.10.2017 03:00",
"ID": "Z",
"VALUE_ONE": 15,
"VALUE_TWO": 7
}
]
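To simplify processing in the web application and to reduce the file size, I want to convert the "VALUE_ONE", "VALUE_TWO" and "DATE" values into arrays, one set per "ID", like this: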
[{
"DATE": ["10.10.2017 01:00", "10.10.2017 02:00", "10.10.2017 03:00"],
"ID": "X",
"VALUE_ONE": [20, 30, 25],
"VALUE_TWO": [5, 7, 2]
},
{
"DATE": ["10.10.2017 01:00", "10.10.2017 02:00", "10.10.2017 03:00"],
"ID": "Y",
"VALUE_ONE": [10, 20, 50],
"VALUE_TWO": [9, 5, 5]
},
{
"DATE": ["10.10.2017 01:00", "10.10.2017 02:00", "10.10.2017 03:00"],
"ID": "Z",
"VALUE_ONE": [55, 60, 15],
"VALUE_TWO": [3, 7, 7]
}
]
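What matters here is being able to find the value associated with a particular time (DATE). Since the "DATE" values in the input are consecutive, the DATE values are probably no longer needed to look up the requested "VALUE.." values. The array index alone would do (index=0 is always 10.10.2017 01:00, index=1 is ... 02:00, and so on).
Is it possible to do this? It would make the file even smaller.
Thanks!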
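The index-based lookup described above can be sketched outside jq as follows. This is only a minimal Python illustration, not part of the original question: the helper names value_at and index_for are hypothetical, and index_for assumes that the timestamps are in day.month.year format, spaced exactly one hour apart, and start at 10.10.2017 01:00.

```python
from datetime import datetime

# Hypothetical grouped record in the shape the question asks for.
record = {
    "ID": "X",
    "DATE": ["10.10.2017 01:00", "10.10.2017 02:00", "10.10.2017 03:00"],
    "VALUE_ONE": [20, 30, 25],
    "VALUE_TWO": [5, 7, 2],
}

def value_at(record, key, date):
    """Look up record[key] at the position where `date` occurs in DATE."""
    idx = record["DATE"].index(date)  # raises ValueError if the date is absent
    return record[key][idx]

# Assumption: hourly spacing from a known start, so DATE could be dropped.
FORMAT = "%d.%m.%Y %H:%M"
START = datetime.strptime("10.10.2017 01:00", FORMAT)

def index_for(date):
    """Compute the array index of `date` without consulting a DATE array."""
    delta = datetime.strptime(date, FORMAT) - START
    return int(delta.total_seconds() // 3600)

print(value_at(record, "VALUE_ONE", "10.10.2017 02:00"))
print(record["VALUE_TWO"][index_for("10.10.2017 03:00")])
```

If the timestamps are not evenly spaced, or differ between IDs, the computed index no longer matches and the DATE array has to be kept.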
Answer 0 (score: 1)
A two-step reduce (it doesn't look pretty, but it works):
jq 'reduce group_by(.ID)[] as $a ([]; . + [ reduce $a[] as $o
({"DATE":[],"VALUE_ONE":[],"VALUE_TWO":[]};
.DATE |= .+ [$o.DATE] | .ID = $o.ID |.VALUE_ONE |= .+ [$o.VALUE_ONE]
| .VALUE_TWO |= .+ [$o.VALUE_TWO]) ] )' input.json
Output:
[
{
"DATE": [
"10.10.2017 01:00",
"10.10.2017 02:00",
"10.10.2017 03:00"
],
"VALUE_ONE": [
20,
30,
25
],
"VALUE_TWO": [
5,
7,
2
],
"ID": "X"
},
{
"DATE": [
"10.10.2017 01:00",
"10.10.2017 02:00",
"10.10.2017 03:00"
],
"VALUE_ONE": [
10,
20,
50
],
"VALUE_TWO": [
9,
5,
5
],
"ID": "Y"
},
{
"DATE": [
"10.10.2017 01:00",
"10.10.2017 02:00",
"10.10.2017 03:00"
],
"VALUE_ONE": [
55,
60,
15
],
"VALUE_TWO": [
3,
7,
7
],
"ID": "Z"
}
]
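Answer 1 (score: 0)
The following solution avoids group_by, in part because the sort used by group_by may not be stable, which would complicate things. Instead, we use bucketize, defined as follows: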
def bucketize(f): reduce .[] as $x ({}; .[$x|f] += [$x] );
For simplicity, we will also define the following helper function:
# compactify an array with a single ID
def compact:
. as $in
| reduce (.[0]|keys_unsorted[]) as $key ({};
. + {($key): $in|map(.[$key])})
+ {"ID": .[0].ID}
;
[bucketize(.ID)[] | compact]
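This ensures that things work properly even if the set of dates differs between IDs, and even if the JSON objects are not initially grouped by date.
(If you want to drop "DATE" from the final result entirely, replace the call to compact in the line above with compact | del(.DATE).)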
[
{
"DATE": [
"10.10.2017 01:00",
"10.10.2017 02:00",
"10.10.2017 03:00"
],
"ID": "X",
"VALUE_ONE": [
20,
30,
25
],
"VALUE_TWO": [
5,
7,
2
]
},
{
"DATE": [
"10.10.2017 01:00",
"10.10.2017 02:00",
"10.10.2017 03:00"
],
"ID": "Y",
"VALUE_ONE": [
10,
20,
50
],
"VALUE_TWO": [
9,
5,
5
]
},
{
"DATE": [
"10.10.2017 01:00",
"10.10.2017 02:00",
"10.10.2017 03:00"
],
"ID": "Z",
"VALUE_ONE": [
55,
60,
15
],
"VALUE_TWO": [
3,
7,
7
]
}
]
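For readers more at home outside jq, the bucketize-then-compact idea can be sketched in Python. This is an illustrative analogue under stated assumptions, not a translation endorsed by the answer: it relies on Python dicts preserving insertion order, and it takes the key set from the first object of each group, as the jq helper does. The sample data is a shortened, deliberately interleaved version of the question's input.

```python
def bucketize(items, key):
    """Group items by key(item) without sorting; first-seen order is kept."""
    buckets = {}
    for item in items:
        buckets.setdefault(key(item), []).append(item)
    return buckets

def compact(group):
    """Turn a non-empty list of same-ID objects into one object of arrays."""
    # Keys come from the first object; a key missing elsewhere yields None.
    out = {k: [obj.get(k) for obj in group] for k in group[0]}
    out["ID"] = group[0]["ID"]  # keep ID as a scalar rather than an array
    return out

# Interleaved input: the objects are not grouped by ID beforehand.
data = [
    {"DATE": "10.10.2017 01:00", "ID": "X", "VALUE_ONE": 20, "VALUE_TWO": 5},
    {"DATE": "10.10.2017 01:00", "ID": "Y", "VALUE_ONE": 10, "VALUE_TWO": 9},
    {"DATE": "10.10.2017 02:00", "ID": "X", "VALUE_ONE": 30, "VALUE_TWO": 7},
    {"DATE": "10.10.2017 02:00", "ID": "Y", "VALUE_ONE": 20, "VALUE_TWO": 5},
]

result = [compact(g) for g in bucketize(data, lambda o: o["ID"]).values()]
print(result)
```

Because nothing is sorted, the order of values inside each array is simply the order in which the objects appear in the input.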
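Answer 2 (score: 0)
Here is a solution that uses reduce, setpath, getpath, del, and symbolic variable destructuring. It collects the values of every key other than ID and DATE into parallel arrays (there is no need to hard-code VALUE_ONE and so on).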
reduce (.[] | [.ID, .DATE, del(.ID,.DATE)]) as [$id,$date,$v] ({};
(getpath([$id, "DATE"])|length) as $idx
| setpath([$id, "ID"]; $id)
| setpath([$id, "DATE", $idx]; $date)
| reduce ($v|keys[]) as $k (.; setpath([$id, $k, $idx]; $v[$k]))
)
| map(.)
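The same "no hard-coded keys" approach can be sketched in Python. This is a rough analogue rather than an exact port: the function name collect and its parameters are hypothetical, and the padding with None is meant to mimic how setpath fills skipped array positions with null. The VALUE_THREE key in the sample is invented purely to show that padding.

```python
def collect(items, id_key="ID", date_key="DATE"):
    """Collect every key other than ID and DATE into parallel arrays per ID."""
    by_id = {}
    for item in items:
        rec = by_id.setdefault(item[id_key], {id_key: item[id_key], date_key: []})
        idx = len(rec[date_key])  # position this item occupies in each array
        rec[date_key].append(item[date_key])
        for k, v in item.items():
            if k in (id_key, date_key):
                continue
            col = rec.setdefault(k, [])
            # Pad with None if this key was absent from earlier items.
            col.extend([None] * (idx - len(col)))
            col.append(v)
    return list(by_id.values())

data = [
    {"DATE": "10.10.2017 01:00", "ID": "X", "VALUE_ONE": 20},
    {"DATE": "10.10.2017 02:00", "ID": "X", "VALUE_ONE": 30, "VALUE_THREE": 8},
    {"DATE": "10.10.2017 01:00", "ID": "Y", "VALUE_ONE": 10},
]

result = collect(data)
print(result)
```

Keeping every array aligned with DATE is what makes index-based lookup safe when some objects lack a key.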
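Answer 3 (score: 0)
If your dataset is small enough, you can simply group the objects by ID and map each group to the desired result. It won't be super efficient compared with a streaming solution, but it is the simplest approach using built-in functions.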
group_by(.ID) | map({
DATE: map(.DATE),
ID: .[0].ID,
VALUE_ONE: map(.VALUE_ONE),
VALUE_TWO: map(.VALUE_TWO)
})
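The group-then-map shape of this filter has a close Python counterpart in sorted plus itertools.groupby. The sketch below is illustrative only: the function name group_and_map is hypothetical, and the field names are hard-coded just as in the jq filter above. Python's sorted is guaranteed stable, so rows within one ID keep their input order.

```python
from itertools import groupby
from operator import itemgetter

def group_and_map(items):
    """Sort by ID, group adjacent rows, and map each group to arrays."""
    by_id = itemgetter("ID")
    result = []
    # groupby only merges adjacent rows, hence the sort beforehand.
    for id_, grp in groupby(sorted(items, key=by_id), key=by_id):
        rows = list(grp)
        result.append({
            "DATE": [r["DATE"] for r in rows],
            "ID": id_,
            "VALUE_ONE": [r["VALUE_ONE"] for r in rows],
            "VALUE_TWO": [r["VALUE_TWO"] for r in rows],
        })
    return result

data = [
    {"DATE": "10.10.2017 01:00", "ID": "Y", "VALUE_ONE": 10, "VALUE_TWO": 9},
    {"DATE": "10.10.2017 01:00", "ID": "X", "VALUE_ONE": 20, "VALUE_TWO": 5},
    {"DATE": "10.10.2017 02:00", "ID": "Y", "VALUE_ONE": 20, "VALUE_TWO": 5},
    {"DATE": "10.10.2017 02:00", "ID": "X", "VALUE_ONE": 30, "VALUE_TWO": 7},
]

result = group_and_map(data)
print(result)
```

As with the jq version, the whole input is held in memory and sorted, which is fine for small files but is the cost the answer alludes to.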