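I need to merge an array across a series of identically structured, nested JSON files that share the same higher-level keys.
The goal is to create a merged file while preserving all existing higher-level keys and values.
File 1: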
{
"account": "123456789012",
"regions": [
{
"region": "one",
"services": [
{
"groups": [
{
"GroupId": "123456",
"GroupName": "foo"
},
{
"GroupId": "234567",
"GroupName": "bar"
}
]
}
]
}
]
}
File 2:
{
"account": "123456789012",
"regions": [
{
"region": "one",
"services": [
{
"group_policies": [
{
"GroupName": "foo",
"PolicyNames": [
"all_foo",
"all_bar"
]
},
{
"GroupName": "bar",
"PolicyNames": [
"all_bar"
]
}
]
}
]
}
]
}
Expected result:
{
"account": "123456789012",
"regions": [
{
"region": "one",
"services": [
{
"groups": [
{
"GroupId": "123456",
"GroupName": "foo"
},
{
"GroupId": "234567",
"GroupName": "bar"
}
]
},
{
"group_policies": [
{
"GroupName": "foo",
"PolicyNames": [
"all_foo",
"all_bar"
]
},
{
"GroupName": "bar",
"PolicyNames": [
"all_bar"
]
}
]
}
]
}
]
}
Based on answers to other questions like this one, I tried the following without success:
jq -s '.[0] * .[1]' test1.json test2.json
jq -s add test1.json test2.json
jq -n '[inputs[]]' test{1,2}.json
The following successfully merges the arrays, but the higher-level keys and values are missing from the result:
jq -s '.[0].regions[0].services[0] * .[1].regions[0].services[0]' test1.json test2.json
I assume there is a simple jq solution that has eluded my search. If not, any combination of jq and bash is acceptable for a solution.
Answer 0 (score: 1)
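Here is a solution that converts the arrays to objects down to the service level, merges them with *, and converts the result back to array form. If file1 and file2 contain the sample data, then this command: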
$ jq -Mn --argfile file1 file1 --argfile file2 file2 '
def merge: # merge function
($file1, $file2) # process $file1 then $file2
| .account as $a # save .account in $a
| .regions[] # for each element of .regions
| .region as $r # save .region in $r
| .services[] as $s # save each element of .services in $s
| {($a): {($r): $s}} # generate object for each account,region,service
# | debug # uncomment debug here to see stream
;
reduce merge as $x ({}; . * $x) # use '*' to recombine all the objects from merge
# | debug # uncomment debug here to see combined object
| keys[] as $a # for each key (account) of combined object
| {account:$a, regions:[ # construct object with {account, regions array}
.[$a] # for each account
| keys[] as $r # for each key (region) of account object
| {region:$r, services:[ # construct object with {region, services array}
.[$r] # for each region
| keys[] as $s # for each service
| {($s): .[$s]} # generate service object
]} # add service objects to service array
]}' # add region object to regions array
produces
{
"account": "123456789012",
"regions": [
{
"region": "one",
"services": [
{
"group_policies": [
{
"GroupName": "foo",
"PolicyNames": [
"all_foo",
"all_bar"
]
},
{
"GroupName": "bar",
"PolicyNames": [
"all_bar"
]
}
]
},
{
"groups": [
{
"GroupId": "123456",
"GroupName": "foo"
},
{
"GroupId": "234567",
"GroupName": "bar"
}
]
}
]
}
]
}
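The question mentions a whole series of files, and as far as I know the jq manual discourages --argfile in favor of --slurpfile, so one possible variation is to read the files with inputs under -n. The sketch below assumes that approach; the temporary directory and the test1.json/test2.json names are only illustrative, and the filter body is otherwise the one shown above.

```shell
set -eu
workdir=$(mktemp -d)   # scratch directory, illustrative only

cat > "$workdir/test1.json" <<'EOF'
{"account":"123456789012","regions":[{"region":"one","services":[
  {"groups":[{"GroupId":"123456","GroupName":"foo"},
             {"GroupId":"234567","GroupName":"bar"}]}]}]}
EOF

cat > "$workdir/test2.json" <<'EOF'
{"account":"123456789012","regions":[{"region":"one","services":[
  {"group_policies":[{"GroupName":"foo","PolicyNames":["all_foo","all_bar"]},
                     {"GroupName":"bar","PolicyNames":["all_bar"]}]}]}]}
EOF

merged=$(jq -cn '
  def merge:
      inputs                      # every file named on the command line
    | .account as $a
    | .regions[]
    | .region as $r
    | .services[] as $s
    | {($a): {($r): $s}}
    ;
  reduce merge as $x ({}; . * $x)
  | keys[] as $a
  | {account: $a, regions: [
      .[$a]
      | keys[] as $r
      | {region: $r, services: [
          .[$r]
          | keys[] as $s
          | {($s): .[$s]}
        ]}
    ]}
' "$workdir"/test1.json "$workdir"/test2.json)

echo "$merged"
```

Because merge no longer names $file1 and $file2, further files can be appended to the command line without touching the filter.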
Assembling this step by step gives a better idea of how it works. Start with this filter:
def merge: # merge function
($file1, $file2) # process $file1 then $file2
| .account as $a # save .account in $a
| $a
;
merge
Because there are two objects (one from file1, one from file2), this outputs the .account of each:
"123456789012"
"123456789012"
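Note that .account as $a does not change the current value of . (the input).
Variables allow us to "drill down" into sub-objects without losing the higher-level context. Consider this filter: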
def merge: # merge function
($file1, $file2) # process $file1 then $file2
| .account as $a # save .account in $a
| .regions[] # for each element of .regions
| .region as $r # save .region in $r
| [$a, $r]
;
merge
which outputs (account, region) pairs:
["123456789012","one"]
["123456789012","one"]
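The binding behavior can also be seen in isolation. This is a minimal sketch on a made-up object (the "123" account value is not from the question's files): the value is captured before the filter descends, and it is still available afterwards.

```shell
pair=$(jq -cn '{"account": "123", "regions": [{"region": "one"}]}
  | .account as $a      # bind the account; "." is still the whole object
  | .regions[]          # now "." is a single region
  | [$a, .region]')     # the bound value survives the descent
echo "$pair"            # ["123","one"]
```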
Now we can continue drilling down into services:
def merge: # merge function
($file1, $file2) # process $file1 then $file2
| .account as $a # save .account in $a
| .regions[] # for each element of .regions
| .region as $r # save .region in $r
| .services[]
| [$a, $r, .]
;
merge
At this point the third element of the array (.) refers to each successive service in the .services array, so this filter generates
["123456789012","one",{"groups":[{"GroupId":"123456","GroupName":"foo"},
{"GroupId":"234567","GroupName":"bar"}]}]
["123456789012","one",{"group_policies":[{"GroupName":"foo","PolicyNames":["all_foo","all_bar"]},
{"GroupName":"bar","PolicyNames":["all_bar"]}]}]
This (complete) merge function:
def merge: # merge function
($file1, $file2) # process $file1 then $file2
| .account as $a # save .account in $a
| .regions[] # for each element of .regions
| .region as $r # save .region in $r
| .services[] as $s # save each element of .services in $s
| {($a): {($r): $s}} # generate object for each account,region,service
;
merge
generates the stream
{"123456789012":{"one":{"groups":[{"GroupId":"123456","GroupName":"foo"},
{"GroupId":"234567","GroupName":"bar"}]}}}
{"123456789012":{"one":{"group_policies":[{"GroupName":"foo","PolicyNames":["all_foo","all_bar"]},
{"GroupName":"bar","PolicyNames":["all_bar"]}]}}}
The important thing to note is that these are objects, which can easily be merged with * in a reduce step:
def merge: # merge function
($file1, $file2) # process $file1 then $file2
| .account as $a # save .account in $a
| .regions[] # for each element of .regions
| .region as $r # save .region in $r
| .services[] as $s # save each element of .services in $s
| {($a): {($r): $s}} # generate object for each account,region,service
;
reduce merge as $x ({}; . * $x) # use '*' to recombine all the objects from merge
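reduce initializes its local state (.) to {} and then computes a new state for each result of the merge function by evaluating . * $x, recursively combining the objects that merge built from $file1 and $file2: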
{"123456789012":{"one":{"groups":[{"GroupId":"123456","GroupName":"foo"},
{"GroupId":"234567","GroupName":"bar"}],
"group_policies":[{"GroupName":"foo","PolicyNames":["all_foo","all_bar"]},
{"GroupName":"bar","PolicyNames":["all_bar"]}]}}}
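To watch the state evolve, the same init and update expressions can be given to foreach, which emits every intermediate state rather than only the last one. This is a sketch on small made-up objects, not on the question's files.

```shell
# foreach with the same init ({}) and update (. * $x) as the reduce above;
# the third expression (.) emits the state after each step.
states=$(jq -cn '
  foreach ({"a": {"x": 1}}, {"a": {"y": 2}}, {"b": 3}) as $x
    ({}; . * $x; .)
')
echo "$states"
# {"a":{"x":1}}
# {"a":{"x":1,"y":2}}
# {"a":{"x":1,"y":2},"b":3}

# reduce keeps only the final state
final=$(jq -cn 'reduce ({"a": {"x": 1}}, {"a": {"y": 2}}, {"b": 3}) as $x ({}; . * $x)')
echo "$final"
```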
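Note that * stops merging at the arrays held under the 'groups' and 'group_policies' keys.
If we wanted to keep merging below that level, we could create more objects in the merge function. For example, consider this extension: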
def merge: # merge function
($file1, $file2) # process $file1 then $file2
| .account as $a # save .account in $a
| .regions[] # for each element of .regions
| .region as $r # save .region in $r
| .services[] as $s # save each element of .services in $s
| (
$s.groups[]? as $g
| {($a): {($r): {groups: {($g.GroupId): $g}}}}
), (
$s.group_policies[]? as $p
| {($a): {($r): {group_policies: {($p.GroupName): $p}}}}
)
;
merge
This merge goes deeper than the previous one, producing
{"123456789012":{"one":{"groups":{"123456":{"GroupId":"123456","GroupName":"foo"}}}}}
{"123456789012":{"one":{"groups":{"234567":{"GroupId":"234567","GroupName":"bar"}}}}}
{"123456789012":{"one":{"group_policies":{"foo":{"GroupName":"foo","PolicyNames":["all_foo","all_bar"]}}}}}
{"123456789012":{"one":{"group_policies":{"bar":{"GroupName":"bar","PolicyNames":["all_bar"]}}}}}
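Keying each group by GroupId and each policy by GroupName matters because * merges objects key by key but replaces any other value, arrays included, with the right-hand side. A small sketch with made-up values shows the difference:

```shell
# Objects under the same key are merged recursively.
objects=$(jq -cn '{"groups": {"1": "foo"}} * {"groups": {"2": "bar"}}')
# Arrays under the same key are not merged; the right-hand array wins.
arrays=$(jq -cn '{"groups": ["foo"]} * {"groups": ["bar"]}')

echo "$objects"   # {"groups":{"1":"foo","2":"bar"}}
echo "$arrays"    # {"groups":["bar"]}
```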
What is important here is that the "groups" and "group_policies" keys now contain objects. This means that in this filter
def merge: # merge function
($file1, $file2) # process $file1 then $file2
| .account as $a # save .account in $a
| .regions[] # for each element of .regions
| .region as $r # save .region in $r
| .services[] as $s # save each element of .services in $s
| (
$s.groups[]? as $g
| {($a): {($r): {groups: {($g.GroupId): $g}}}}
), (
$s.group_policies[]? as $p
| {($a): {($r): {group_policies: {($p.GroupName): $p}}}}
)
;
reduce merge as $x ({}; . * $x)
the reduce with * will merge the groups and group policies rather than overwrite them, generating:
{"123456789012":{"one":{"groups":{"123456":{"GroupId":"123456","GroupName":"foo"},
"234567":{"GroupId":"234567","GroupName":"bar"}},
"group_policies":{"foo":{"GroupName":"foo","PolicyNames":["all_foo","all_bar"]},
"bar":{"GroupName":"bar","PolicyNames":["all_bar"]}}}}}
Putting this back into the original form takes a little more work, but not much:
def merge: # merge function
($file1, $file2) # process $file1 then $file2
| .account as $a # save .account in $a
| .regions[] # for each element of .regions
| .region as $r # save .region in $r
| .services[] as $s # save each element of .services in $s
| (
$s.groups[]? as $g
| {($a): {($r): {groups: {($g.GroupId): $g}}}}
), (
$s.group_policies[]? as $p
| {($a): {($r): {group_policies: {($p.GroupName): $p}}}}
)
;
reduce merge as $x ({}; . * $x)
| keys[] as $a # for each key (account) of combined object
| {account:$a, regions:[ # construct object with {account, regions array}
.[$a] # for each account
| keys[] as $r # for each key (region) of account object
| {region:$r, services:[ # construct object with {region, services array}
.[$r] # for each region
| {groups: [.groups[]]} # add groups to service
, {group_policies: [.group_policies[]]} # add group_policies to service
]}
]}
Now, to see the difference this version makes, suppose our file2 contains a group as well as group_policies, e.g.
{
"account": "123456789012",
"regions": [
{
"region": "one",
"services": [
{
"groups": [
{
"GroupId": "999",
"GroupName": "baz"
}
]
},
{
"group_policies": [
{
"GroupName": "foo",
"PolicyNames": [
"all_foo",
"all_bar"
]
},
{
"GroupName": "bar",
"PolicyNames": [
"all_bar"
]
}
]
}
]
}
]
}
The first version of this solution produces
{
"account": "123456789012",
"regions": [
{
"region": "one",
"services": [
{
"group_policies": [
{
"GroupName": "foo",
"PolicyNames": [
"all_foo",
"all_bar"
]
},
{
"GroupName": "bar",
"PolicyNames": [
"all_bar"
]
}
]
},
{
"groups": [
{
"GroupId": "999",
"GroupName": "baz"
}
]
}
]
}
]
}
whereas this revised version produces
{
"account": "123456789012",
"regions": [
{
"region": "one",
"services": [
{
"groups": [
{
"GroupId": "123456",
"GroupName": "foo"
},
{
"GroupId": "234567",
"GroupName": "bar"
},
{
"GroupId": "999",
"GroupName": "baz"
}
]
},
{
"group_policies": [
{
"GroupName": "foo",
"PolicyNames": [
"all_foo",
"all_bar"
]
},
{
"GroupName": "bar",
"PolicyNames": [
"all_bar"
]
}
]
}
]
}
]
}
Answer 1 (score: 0)
Combining jq with jq -s add gives us:
jq '.hits.hits' logs.*.json | jq -s add
This merges the hits.hits arrays from all of the logs.*.json files into one big array.
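As a sketch of what the pipeline does: the first jq emits one array per file, and the second slurps those arrays into a single array of arrays, which add then concatenates. The two log files and their contents below are made up for illustration, and the single-process variant using inputs is an assumption about what would be equivalent, not part of the original answer.

```shell
workdir=$(mktemp -d)   # scratch directory with two hypothetical log files
printf '%s\n' '{"hits":{"hits":[{"id":1},{"id":2}]}}' > "$workdir/logs.a.json"
printf '%s\n' '{"hits":{"hits":[{"id":3}]}}' > "$workdir/logs.b.json"

# Two-step form from the answer: one array per file, then slurp and add.
all_hits=$(jq '.hits.hits' "$workdir"/logs.*.json | jq -cs add)
echo "$all_hits"       # [{"id":1},{"id":2},{"id":3}]

# Single-process variant: collect every hit from every input file.
all_hits_single=$(jq -cn '[inputs | .hits.hits[]]' "$workdir"/logs.*.json)
echo "$all_hits_single"
```

Note that this concatenates arrays only; it does not preserve the higher-level keys that the question asks to keep.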