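I want to merge two JSON objects into a new one. I tried using jsonmerge with a full JSON schema, but I can't figure out how to set the merge strategies correctly. I'm fairly sure it can be done.
Code: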
import json
from jsonmerge import Merger
from jsonschema import validate
full_build = {
    "captures": [
        {
            "compiler": "gnu",
            "executable": "gcc",
            "cmd": ["gcc", "options", "file1.cpp"],
            "cwd": ".",
            "env": ["A=1", "B=2"],
        },
        {
            "compiler": "gnu",
            "executable": "gcc",
            "cmd": ["gcc", "options", "file2.cpp"],
            "cwd": ".",
            "env": ["A=1", "B=2"],
        }
    ]
}
incremental_build = {
    "captures": [
        {
            "compiler": "gnu",
            "executable": "gcc",
            "cmd": ["gcc", "new options", "file2.cpp"],
            "cwd": ".",
            "env": ["A=1", "NEW=2"],
        },
        {
            "compiler": "gnu",
            "executable": "gcc",
            "cmd": ["gcc", "options", "file3.cpp"],
            "cwd": ".",
            "env": ["A=1", "B=2"],
        }
    ]
}
schema = {
    "type": "object",
    "properties": {
        "captures": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "cmd": {
                        "type": "array",
                        "items": {"type": "string"},
                    },
                    "compiler": {"type": "string"},
                    "cwd": {"type": "string"},
                    "env": {
                        "type": "array",
                        "items": {"type": "string"},
                    },
                    "executable": {"type": "string"},
                }
            }
        }
    }
}
validate(instance=full_build, schema=schema)
mergeSchema = schema
merger = Merger(mergeSchema)
result = merger.merge(full_build, incremental_build)
print(json.dumps(result, indent=3))
Result:
{
   "captures": [
      {
         "compiler": "gnu",
         "executable": "gcc",
         "cmd": [
            "gcc",
            "options",
            "file3.cpp"
         ],
         "cwd": ".",
         "env": [
            "A=1",
            "B=2"
         ]
      }
   ]
}
Expected result:
{
   "captures": [
      {
         "compiler": "gnu",
         "executable": "gcc",
         "cmd": [
            "gcc",
            "options",
            "file1.cpp"
         ],
         "cwd": ".",
         "env": [
            "A=1",
            "B=2"
         ]
      },
      {
         "compiler": "gnu",
         "executable": "gcc",
         "cmd": [
            "gcc",
            "new options",
            "file2.cpp"
         ],
         "cwd": ".",
         "env": [
            "A=1",
            "NEW=2"
         ]
      },
      {
         "compiler": "gnu",
         "executable": "gcc",
         "cmd": [
            "gcc",
            "options",
            "file3.cpp"
         ],
         "cwd": ".",
         "env": [
            "A=1",
            "B=2"
         ]
      }
   ]
}
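There are more things to consider (for example, more or fewer options or environment variables than before), but I think I can manage that part myself. I really don't want to hard-code this.
No, I can't change the structure of the JSON :(.
Background: I want to merge SonarQube build wrapper outputs, because I don't want to run a full build just to get every file into the wrapper output.
Answer 0 (score: 2)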
It seems you don't really need any complex merge operation at all. You basically want to combine the ‘captures’ lists from both structures into a new structure which contains all of them. This can be achieved by making a copy and simply extending the list afterwards:
import copy

full_build = ...
incremental_build = ...
# Copy first so that full_build itself is left untouched.
combined = copy.deepcopy(full_build)
combined['captures'].extend(incremental_build['captures'])
If you want to ‘deduplicate’ based on some attribute, e.g. the file name, you can use something like this:
import copy

def get_filename_from_capture(capture):
    # The source file is taken to be the last argument of the command.
    return capture["cmd"][-1]

all_captures = full_build["captures"] + incremental_build["captures"]
# Later captures overwrite earlier ones that share the same file name.
captures_by_filename = {
    get_filename_from_capture(capture): capture for capture in all_captures
}
combined = copy.deepcopy(full_build)
combined["captures"] = list(captures_by_filename.values())
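The question mentions that the number of options can vary between builds. If the source file is not guaranteed to be the last argument either, one possible variation is to look for an argument with a known source extension. This is only a sketch: `SOURCE_EXTENSIONS`, `source_file_of` and `merge_captures` are names made up for illustration, and the assumption that a capture compiles exactly one file recognisable by its extension may not hold for every build wrapper output.

```python
import copy

# Hypothetical set of extensions that mark an argument as a source file.
SOURCE_EXTENSIONS = (".c", ".cc", ".cpp", ".cxx")


def source_file_of(capture):
    """Return the last command argument that looks like a source file."""
    for arg in reversed(capture["cmd"]):
        if arg.endswith(SOURCE_EXTENSIONS):
            return arg
    raise ValueError(f"no source file found in {capture['cmd']!r}")


def merge_captures(full_build, incremental_build):
    """Combine both capture lists; a later capture replaces an earlier one for the same file."""
    by_file = {}
    for capture in full_build["captures"] + incremental_build["captures"]:
        # Reassigning an existing key keeps its original position in the dict.
        by_file[source_file_of(capture)] = capture
    combined = copy.deepcopy(full_build)
    combined["captures"] = copy.deepcopy(list(by_file.values()))
    return combined
```

Because a dict keeps the insertion position of a key when its value is replaced, the updated capture for a file stays where the original one was, and captures for new files are appended at the end.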
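Answer 1 (score: 2)
You have two arrays of JSON objects and want to build a single array from them.
In your example, it seems that sometimes you want an object from incremental_build to overwrite an object from full_build (the final array contains only one object mentioning file2.cpp), but sometimes you don't (the object for file3.cpp does not overwrite the object for file1.cpp).
You didn't specify the exact rule, but I'm guessing you want to match on the file name. I'm also guessing that you want to treat the array elements themselves as immutable and don't want them merged any further when the file names match.
To achieve this, you can use the following schema: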
schema = {
    "properties": {
        "captures": {
            "mergeStrategy": "arrayMergeById",
            "mergeOptions": {
                "idRef": "/cmd/2"
            },
            "items": {
                "mergeStrategy": "overwrite"
            }
        }
    }
}
merger = Merger(schema)
result = merger.merge(full_build, incremental_build)
You don't need the full schema unless you also want to validate the JSON. jsonmerge itself only cares about the merge strategy information.
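The schema above specifies that the array under the captures property of the top-level object should be merged using the arrayMergeById strategy. This strategy merges array elements according to the value that the idRef reference points to. In your example, the file name is the third element of the cmd property (JSON pointers use zero-based indexing).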
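To make the "/cmd/2" reference concrete, here is a small stand-alone resolver for JSON pointers. It only illustrates the pointer semantics and is not how jsonmerge resolves idRef internally; `resolve_pointer` is a hypothetical helper written for this example.

```python
def resolve_pointer(document, pointer):
    """Resolve a JSON pointer such as "/cmd/2" against nested dicts and lists."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON pointer: {pointer!r}")
    current = document
    for token in pointer[1:].split("/"):
        # Undo the two JSON pointer escape sequences.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array indices are zero-based
        else:
            current = current[token]
    return current


capture = {"cmd": ["gcc", "options", "file2.cpp"], "cwd": "."}
print(resolve_pointer(capture, "/cmd/2"))  # file2.cpp
```

Note that index 2 is only the file name as long as every command has exactly two arguments before it, so a capture with more or fewer options would be matched on a different value.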
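arrayMergeById merges matching array elements according to their own schema. By default, they would be merged with the objectMerge strategy. That would give the wrong result when an element in incremental_build lacks a property that is present in the matching full_build element. For that reason, the schema above also specifies the overwrite strategy for all items of the captures array.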
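For readers who want to see the intended behaviour without the library, the combination of matching by id and overwriting can be approximated in plain Python. This is a rough emulation under my own assumptions, not jsonmerge's implementation; `merge_by_id_overwrite` and `get_id` are hypothetical names.

```python
import copy


def merge_by_id_overwrite(base_items, head_items, get_id):
    """Replace base items whose id matches a head item; append head items with new ids."""
    merged = copy.deepcopy(base_items)
    position = {get_id(item): index for index, item in enumerate(merged)}
    for item in head_items:
        key = get_id(item)
        if key in position:
            # Overwrite: the whole element is replaced, nothing is merged into it.
            merged[position[key]] = copy.deepcopy(item)
        else:
            position[key] = len(merged)
            merged.append(copy.deepcopy(item))
    return merged


base = [
    {"cmd": ["gcc", "options", "file1.cpp"], "env": ["A=1", "B=2"]},
    {"cmd": ["gcc", "options", "file2.cpp"], "env": ["A=1", "B=2"]},
]
head = [
    {"cmd": ["gcc", "new options", "file2.cpp"]},  # no "env" this time
    {"cmd": ["gcc", "options", "file3.cpp"], "env": ["A=1", "B=2"]},
]
merged = merge_by_id_overwrite(base, head, lambda capture: capture["cmd"][2])
print([capture["cmd"][2] for capture in merged])
```

In this sketch the replaced file2.cpp element has no env key at all, which is the point of overwriting: a property-by-property merge would have kept the stale env from the older element.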