Convert YAML multi-line values to folded block scalar style?

Time: 2016-12-28 00:50:00

Tags: python yaml ruamel.yaml

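Using ruamel.yaml, I am trying to produce YAML in a particular style: single-line strings should start on the same line as the :, and multi-line strings should use a folded scalar style (| / |-) with lines limited to a certain number of characters (word-wrapped).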

My attempt so far is heavily influenced by a similar function called walk_tree in the sources:

#!/usr/bin/env python

import ruamel.yaml
from ruamel.yaml.scalarstring import ScalarString, PreservedScalarString

def walk_tree(base):
    from ruamel.yaml.compat import string_types

    if isinstance(base, dict):
        for k in base:
            v = base[k]
            if isinstance(v, string_types):
                v = v.replace('\r\n', '\n').replace('\r', '\n').strip()
                base[k] = ScalarString(v) if '\n' in v else v
            else:
                walk_tree(v)
    elif isinstance(base, list):
        for idx, elem in enumerate(base):
            if isinstance(elem, string_types) and '\n' in elem:
                print(elem) # @Anthon: this print is in the original code as well
                base[idx] = preserve_literal(elem)
            else:
                walk_tree(elem)

with open("input.yaml", "r") as fi:
    inp = fi.read()

loader=ruamel.yaml.RoundTripLoader
data = ruamel.yaml.load(inp, loader)

walk_tree(data)

dumper = ruamel.yaml.RoundTripDumper
with open("output.yaml", "w") as fo:
    ruamel.yaml.dump(data, fo, Dumper=dumper, allow_unicode=True)

But then I get an exception: ruamel.yaml.representer.RepresenterError: cannot represent an object: …. If I replace ScalarString with PreservedScalarString, as in the original walk_tree code, I get no exception, but I get literal blocks again, which is not what I want.

So how can my code be fixed so that it works?

1 Answer:

Answer 0: (score: 2)

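The class ScalarString is the base class of LiteralScalarString, and, as you found out, it has no representer. You should just make/keep this a Python string, as that deals with special characters appropriately (quoting strings that need to be quoted to conform to the YAML specification).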

Assuming you have input like this:

- 1
- abc: |
    this is a short string scalar with a newline
    in it
- "there are also a multiline\nsequence element\nin this file\nand it is longer"

You probably want to do something like:

import ruamel.yaml
from ruamel.yaml.scalarstring import LiteralScalarString, preserve_literal


def walk_tree(base):
    from ruamel.yaml.compat import string_types

    def test_wrap(v):
        v = v.replace('\r\n', '\n').replace('\r', '\n').strip()
        return v if len(v) < 72 else preserve_literal(v)

    if isinstance(base, dict):
        for k in base:
            v = base[k]
            if isinstance(v, string_types) and '\n' in v:
                base[k] = test_wrap(v)
            else:
                walk_tree(v)
    elif isinstance(base, list):
        for idx, elem in enumerate(base):
            if isinstance(elem, string_types) and '\n' in elem:
                base[idx] = test_wrap(elem)
            else:
                walk_tree(elem)

yaml = ruamel.yaml.YAML()

with open("input.yaml", "r") as fi:
    data = yaml.load(fi)

walk_tree(data)

with open("output.yaml", "w") as fo:
    yaml.dump(data, fo)

This gives the output:

- 1
- abc: "this is a short string scalar with a newline\nin it"
- |-
  there are also a multiline
  sequence element
  in this file
  and it is longer

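Some notes:

  • Prefer LiteralScalarString over PreservedScalarString. The latter name is a remnant of the time when it was the only preserved string type.
  • You probably did not have a sequence element that was a string, because you did not import preserve_literal, although it is still used in the copied code.
  • I factored the "wrapping" code out into test_wrap, which is used for both mapping values and sequence elements; its maximum line length is set to 72 characters.
  • The value data[1]['abc'] loads as a LiteralScalarString. If you want to preserve existing literal style string scalars, you should test for those before testing on type string_types.
  • I used the new API, with an instance of YAML().
  • You might have to set the width attribute to something like 1000 (yaml.width = 1000) to prevent automatic line wrapping if you increase the 72 in the example beyond the default width of 80.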