我收到以下输入文件:
- input.json:
[
{"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":18,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:30:00","VALUE":160,"FLAG":"0"},
{"ID":"bbb_0021122","time_CET":"00:00:00","VALUE":null,"FLAG":"?"},
{"ID":"bbb_0021122","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
{"ID":"bbb_0021122","time_CET":"00:30:00","VALUE":22,"FLAG":"0"},
{"ID":"ccc_0021122","time_CET":"00:00:00","VALUE":null,"FLAG":"?"},
{"ID":"ccc_0021122","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
{"ID":"ccc_0021122","time_CET":"00:30:00","VALUE":20,"FLAG":"0"},
{"ID":"ddd_122455","time_CET":"00:00:00","VALUE":null,"FLAG":"?"},
{"ID":"ddd_122455","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
{"ID":"ddd_122455","time_CET":"00:30:00","VALUE":null,"FLAG":"?"},
]
如您所见,有一些有效值(FLAG:0)和一些无效值(FLAG:“?”)。 现在我有一个看起来像这样的文件(每个ID一个):
aaa.json:
[
{"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
{"ID":"aaa_12301248","time_CET":"00:55:00","VALUE":45,"FLAG":"0"}
]
如您所见,对象1与input.json中的对象相同,但对象2无效(FLAG:“?”)。这就是为什么对象2必须被input.json中的正确对象替换(使用VALUE:18)。 可以通过“time_CET”和“ID”元素来标识对象。
此外,input.json中将有新对象,它们不属于aaa.json等。这些对象应该添加到数组中,并且应保留aaa.json中的有效对象。
最后,aaa.json应该是这样的:
[
{"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":18,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:30:00","VALUE":160,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:55:00","VALUE":45,"FLAG":"0"}
]
所以,总结一下:
我不确定是否可以使用这样的命令一次完成此操作,因为输出必须返回到正确的id文件(aaa,bbb ccc.json):
jq --argfile aaa aaa.json --argfile bbb bbb.json .... -f prog.jq input.json
问题是,标识符(aaa,bbb,ccc等)之后的数字可能会改变。因此,为了确保将对象添加到正确的文件/数组中,需要这样的语句:
if (."ID"|contains("aaa")) then ....
或者使用不同的输入参数多次运行程序更好?我不确定..
提前谢谢!!
答案 0 :(得分:1)
这是一种方法
#!/bin/bash
# usage: update.sh input.json aaa.json bbb.json....
# updates each of aaa.json bbb.json....
input_json="$1"
shift
for i in "$@"; do
jq -M --argfile input_json "$input_json" '
# functions to restrict input.json to keys of current xxx.json file
def prefix: input_filename | split(".")[0];
def selectprefix: select(.ID | startswith(prefix));
# functions to build and probe a lookup table
def pk: [.ID, .time_CET];
def lookup($t;$k): $t | getpath($k);
def lookup($t): lookup($t;pk);
def organize(s): reduce s as $r ({}; setpath($r|pk; $r));
# functions to identify objects in input.json missing from xxx.json
def pks: paths | select(length==2);
def missing($t1;$t2): [$t1|pks] - [$t2|pks] | .[];
def getmissing($t1;$t2): [ missing($t1;$t2) as $p | lookup($t1;$p)];
# main routine
organize(.[]) as $xxx
| organize($input_json[] | selectprefix) as $inp
| map(if .FLAG != "?" then . else . += lookup($inp) end)
| . + getmissing($inp;$xxx)
' "$i" | sponge "$i"
done
该脚本在循环中使用jq来读取和更新每个aaa.json
...文件。
过滤器会创建临时对象,以便按[ID,time_CET]
查找值,使用FLAG更新aaa.json
中的任何值==“?”最后添加input.json
中遗漏的aaa.json
中的任何值。
input.json
的临时查找表使用input_filename
,以便只包含以与当前处理文件的名称匹配的前缀开头的键。
示例运行:
$ ./update.sh input.json aaa.json
运行后 aaa.json
:
[
{
"ID": "aaa_12301248",
"time_CET": "00:00:00",
"VALUE": 10,
"FLAG": "0"
},
{
"ID": "aaa_12301248",
"time_CET": "00:15:00",
"VALUE": 18,
"FLAG": "0"
},
{
"ID": "aaa_12301248",
"time_CET": "00:55:00",
"VALUE": 45,
"FLAG": "0"
},
{
"ID": "aaa_12301248",
"time_CET": "00:30:00",
"VALUE": 160,
"FLAG": "0"
}
]