How to use Python to split a YAML file into multiple compose.yaml files based on the value of a key

Time: 2019-04-01 06:50:33

Tags: python-3.x yaml pyyaml ruamel.yaml

I am parsing a YAML file and splitting it into several different YAML files. I used PyYAML's constructors to do this, but the result is poor.

This is part of my project: I need to parse a YAML file that I receive and split it into multiple YAML files based on the value of a key.

The YAML file I receive looks like this:

testname: testname
testall:
    test1:
        name: name1
        location: 0
    test2: 
        name: name2
        location: 2
    test3: 
        name: name3
        location: 0
    test4: 
        name: name4
        location: 2
    ...
locations:
    - 0
    - 2
    - ...  

I want to parse it and split it by device, as follows:

# location0.yaml
testname:test
tests:
    test1:
        name:test1
        location:0
    test3: 
        name: test3
        location: 0
# location2.yaml
testname:test
tests:
    test2:
        name:test2
        location:0
    test4: 
        name: test4
        location: 0

How can I parse the input into the form shown above?

1 Answer:

Answer 0 (score: 0)

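Although you can do this with PyYAML, you would have to restrict yourself to YAML 1.1. For this kind of read-modify-write task you should use ruamel.yaml (disclaimer: I am the author of that package). Not only does it support YAML 1.2, it also preserves any comments, tags, and anchor names that occur in your source, and it can preserve quotes around scalars, literal and folded styles, etc., if you need that.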

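Also note that your output is invalid YAML: you cannot have a multi-line plain (i.e. unquoted) scalar as the key of a (block style) mapping. You would have to write: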

"testname:test
tests":

But I assume you meant these to be two keys of the root-level mapping:

testname: test
tests:

Assuming your input is in input.yaml:

testname: testname
testall:
    test1:
        name: name1    # this is just the first name
        location: 0
    test2: 
        name: "name2"  # quotes added for demo purposes
        location: 2
    test3: 
        name: name3    # as this has the same location as name1 
        location: 0    # these should be together
    test4: 
        name: name4    # and this one goes with name2
        location: 2
locations:
    - 0
    - 2

you can do the following:

from pathlib import Path
import ruamel.yaml

in_file = Path('input.yaml')


yaml = ruamel.yaml.YAML()
yaml.indent(mapping=4, sequence=6, offset=4)  # this matches your input
yaml.preserve_quotes = True
data = yaml.load(in_file)

for loc in data['locations']:
    out_name = f'location{loc}.yaml'
    tests = {}
    d = ruamel.yaml.comments.CommentedMap(dict(testname="test", tests=tests))
    d.yaml_set_start_comment(out_name)
    testall = data['testall']
    for test in testall:
        if loc == testall[test]['location']:
            tests[test] = testall[test]
            tests[test]['location'] = 0
    # since you set location to zero and this affects data, make sure to remove
    # the items. This will prevent things from going wrong in case the
    # locations sequence does have zero, but not as its first number
    for key in tests:
        del testall[key]
    yaml.dump(d, Path(out_name))

which gives location0.yaml:

# location0.yaml
testname: test
tests:
    test1:
        name: name1    # this is just the first name
        location: 0
    test3:
        name: name3    # as this has the same location as name1 
        location: 0    # these should be together

and location2.yaml:

# location2.yaml
testname: test
tests:
    test2:
        name: "name2"  # quotes added for demo purposes
        location: 0
    test4:
        name: name4    # and this one goes with name2
        location: 0
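
The comment in the loop above warns that resetting location to zero changes the loaded data, so tests that were already moved could match again if 0 is not the first entry in locations. As an illustration only, here is a minimal sketch of the same grouping step on plain Python dicts that copies each entry instead of deleting it from the input. The names split_by_location and reset_location are hypothetical and not part of ruamel.yaml, and the sketch assumes plain dicts; whether comments survive the copy when the data comes from ruamel.yaml's round-trip loader is not verified here.

```python
import copy


def split_by_location(data, reset_location=0):
    """Group data['testall'] by 'location'; return {file name: mapping to dump}.

    Hypothetical helper: the input mapping is left unmodified.
    """
    result = {}
    for loc in data["locations"]:
        tests = {}
        for test_name, test in data["testall"].items():
            if test["location"] == loc:
                # Copy the entry so the input keeps its original location value.
                entry = copy.deepcopy(test)
                entry["location"] = reset_location
                tests[test_name] = entry
        result[f"location{loc}.yaml"] = {"testname": "test", "tests": tests}
    return result


sample = {
    "testname": "testname",
    "testall": {
        "test1": {"name": "name1", "location": 0},
        "test2": {"name": "name2", "location": 2},
        "test3": {"name": "name3", "location": 0},
        "test4": {"name": "name4", "location": 2},
    },
    "locations": [2, 0],  # zero is deliberately not the first entry
}

files = split_by_location(sample)
for file_name, content in files.items():
    print(file_name, list(content["tests"]))
```

Because the entries are copied, the order of locations does not matter and nothing has to be removed from testall afterwards. Each value in files could then be passed to a YAML dumper of your choice.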