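I'd like to be able to dump a dictionary containing long strings, and I'd like those strings written in block style for readability. For example: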
foo: |
  this is a
  block literal
bar: >
  this is a
  folded block
PyYAML supports loading documents written in this style, but I can't seem to find a way to dump documents this way. Am I missing something?
Answer 0 (score: 23)
# Python 2 code: unicode is a separate type from str here.
import yaml
# Marker subclasses: the type alone tells the dumper which block style to use.
class folded_unicode(unicode): pass
class literal_unicode(unicode): pass
def folded_unicode_representer(dumper, data):
    return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='>')
def literal_unicode_representer(dumper, data):
    return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='|')
yaml.add_representer(folded_unicode, folded_unicode_representer)
yaml.add_representer(literal_unicode, literal_unicode_representer)
data = {
    'literal': literal_unicode(
        u'by hjw ___\n'
        ' __ /.-.\\\n'
        ' / )_____________\\\\ Y\n'
        ' /_ /=== == === === =\\ _\\_\n'
        '( /)=== == === === == Y \\\n'
        ' `-------------------( o )\n'
        ' \\___/\n'),
    'folded': folded_unicode(
        u'It removes all ordinary curses from all equipped items. '
        'Heavy or permanent curses are unaffected.\n')}
print yaml.dump(data)
Result:
folded: >
  It removes all ordinary curses from all equipped items. Heavy or permanent curses
  are unaffected.
literal: |
  by hjw ___
   __ /.-.\
   / )_____________\\ Y
   /_ /=== == === === =\ _\_
  ( /)=== == === === == Y \
   `-------------------( o )
   \___/
For completeness there should also be str implementations, but I'm going to be lazy :-)
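The code above targets Python 2. As a minimal sketch of the same idea on Python 3, where str is already Unicode, the two marker types can subclass str directly. The class names FoldedStr and LiteralStr and the variable names are illustrative, not part of PyYAML; only yaml.add_representer, represent_scalar, yaml.dump and yaml.safe_load are library calls.

```python
import yaml


class FoldedStr(str):
    """Hypothetical marker type: dump as a folded block scalar (>)."""


class LiteralStr(str):
    """Hypothetical marker type: dump as a literal block scalar (|)."""


def represent_folded(dumper, data):
    # Same tag as a plain string; only the presentation style changes.
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='>')


def represent_literal(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')


# Registers on the default Dumper, which is what yaml.dump() uses.
yaml.add_representer(FoldedStr, represent_folded)
yaml.add_representer(LiteralStr, represent_literal)

doc = {
    'foo': LiteralStr('this is a\nblock literal\n'),
    'bar': FoldedStr('this is a\nfolded block\n'),
}
text = yaml.dump(doc)
print(text)
```

Because the marker types keep the ordinary string tag, loading the dumped text gives back plain str values, so the markers only affect how the document looks.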
Answer 1 (score: 18)
pyyaml does support dumping literal or folded blocks.
Using Representer.add_representer
Define the types:
class folded_str(str): pass
class literal_str(str): pass
class folded_unicode(unicode): pass
class literal_unicode(unicode): pass
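Then you can define the representers for those types. Note that while Gary's solution works well for unicode, you may need some more work to get str strings to work correctly (see the implementation of represent_str).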
def change_style(style, representer):
    def new_representer(dumper, data):
        scalar = representer(dumper, data)
        scalar.style = style
        return scalar
    return new_representer
import yaml
from yaml.representer import SafeRepresenter
# represent_str does handle some corner cases, so use that
# instead of calling represent_scalar directly
represent_folded_str = change_style('>', SafeRepresenter.represent_str)
represent_literal_str = change_style('|', SafeRepresenter.represent_str)
represent_folded_unicode = change_style('>', SafeRepresenter.represent_unicode)
represent_literal_unicode = change_style('|', SafeRepresenter.represent_unicode)
Then you can add those representers to the default dumper:
yaml.add_representer(folded_str, represent_folded_str)
yaml.add_representer(literal_str, represent_literal_str)
yaml.add_representer(folded_unicode, represent_folded_unicode)
yaml.add_representer(literal_unicode, represent_literal_unicode)
... and test it:
data = {
    'foo': literal_str('this is a\nblock literal'),
    'bar': folded_unicode('this is a folded block'),
}
print yaml.dump(data)
Result:
bar: >-
  this is a folded block
foo: |-
  this is a
  block literal
default_style
If you want all of your strings to follow one default style, you can also use the default_style keyword argument, e.g.:
>>> data = { 'foo': 'line1\nline2\nline3' }
>>> print yaml.dump(data, default_style='|')
"foo": |-
  line1
  line2
  line3
or folded style:
>>> print yaml.dump(data, default_style='>')
"foo": >-
  line1

  line2

  line3
or double-quoted style:
>>> print yaml.dump(data, default_style='"')
"foo": "line1\nline2\nline3"
And here are some examples of behavior you may not expect:
data = {
    'foo': literal_str('this is a\nblock literal'),
    'bar': folded_unicode('this is a folded block'),
    'non-printable': literal_unicode('this has a \t tab in it'),
    'leading': literal_unicode(' with leading white spaces'),
    'trailing': literal_unicode('with trailing white spaces '),
}
print yaml.dump(data)
Result:
bar: >-
  this is a folded block
foo: |-
  this is a
  block literal
leading: |2-
   with leading white spaces
non-printable: "this has a \t tab in it"
trailing: "with trailing white spaces "
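See the YAML spec on escaped characters (Section 5.7):
Note that escape sequences are only interpreted in double-quoted scalars. In all other scalar styles, the "\" character has no special meaning and non-printable characters are not available.
If you want to preserve non-printable characters (e.g. TAB), you need to use double-quoted scalars. If you are able to dump a scalar in literal style while it contains a non-printable character (e.g. TAB), your YAML dumper is non-compliant.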
E.g. pyyaml detects the non-printable character \t and uses the double-quoted style even though a default style is specified:
>>> data = { 'foo': 'line1\nline2\n\tline3' }
>>> print yaml.dump(data, default_style='"')
"foo": "line1\nline2\n\tline3"
>>> print yaml.dump(data, default_style='>')
"foo": "line1\nline2\n\tline3"
>>> print yaml.dump(data, default_style='|')
"foo": "line1\nline2\n\tline3"
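Another useful bit of information from the spec is:
All leading and trailing white space characters are excluded from the content
This means that if your string does have leading or trailing white space, it would not be preserved in scalar styles other than double-quoted. As a consequence, pyyaml tries to detect what is in your scalar and may force the double-quoted style.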
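A related variant, sketched here for Python 3, registers a single representer for plain str that asks for literal style only when the string spans several lines. This is an illustration rather than part of the answer above: the names represent_multiline_str and BlockDumper are made up, and the dumper subclass is only there so the module-level dumpers are left untouched. The emitter still decides whether the requested style is allowed, so strings it treats as containing special characters (such as a tab, in the PyYAML versions I am aware of) are expected to fall back to double quotes.

```python
import yaml


def represent_multiline_str(dumper, data):
    # Request literal style only for multi-line strings; None lets the
    # emitter choose the style for everything else.
    style = '|' if '\n' in data else None
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style=style)


class BlockDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass that carries the custom representer."""


BlockDumper.add_representer(str, represent_multiline_str)

doc = {
    'short': 'one line',
    'long': 'line1\nline2\n',
    'tabbed': 'a\n\tb\n',  # assumption: the tab forces the emitter to fall back
}
text = yaml.dump(doc, Dumper=BlockDumper)
print(text)
```

Whatever style the emitter ends up choosing for each value, loading the text should return the original strings, which is the property that matters when mixing styles.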
Answer 2 (score: 0)
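This can be done relatively easily. The only "hurdle" is how to indicate which of the spaces in the string that is to be represented as a folded scalar needs to become a fold. A literal scalar has explicit newlines that carry that information, but this cannot be used for folded scalars, because they can contain explicit newlines themselves, e.g. when there is leading white space. The string also needs a newline at the end so that it is not dumped with a stripping chomping indicator (>-).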
import sys
import ruamel.yaml
folded = ruamel.yaml.scalarstring.FoldedScalarString
literal = ruamel.yaml.scalarstring.LiteralScalarString
yaml = ruamel.yaml.YAML()
data = dict(
    foo=literal('this is a\nblock literal\n'),
    bar=folded('this is a folded block\n'),
)
data['bar'].fold_pos = [data['bar'].index(' folded')]
yaml.dump(data, sys.stdout)
which gives:
foo: |
  this is a
  block literal
bar: >
  this is a
  folded block
The fold_pos attribute expects a reversible iterable holding the positions of the spaces at which to fold.
If you never have pipe characters ('|') in your strings, you could do something like:
import re
s = 'this is a|folded block\n'
sf = folded(s.replace('|', ' '))  # the fold position must hold a space
# '|' is special in regular expressions and needs escaping; use a raw string for the pattern
sf.fold_pos = [x.start() for x in re.finditer(r'\|', s)]
data = dict(
    foo=literal('this is a\nblock literal\n'),
    bar=sf,
)
yaml = ruamel.yaml.YAML()
yaml.dump(data, sys.stdout)
which also gives exactly the output you expect.
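The marker trick can be wrapped in a small helper that does not depend on ruamel.yaml at all. The sketch below is plain Python; fold_positions is a hypothetical name, and it assumes, as the answer does, that the marker character never occurs in the real text.

```python
def fold_positions(marked, marker='|'):
    """Return (text, positions) for a string that uses `marker` as a fold hint.

    `text` is the input with every marker replaced by a space, and
    `positions` lists the indexes where those spaces sit.
    """
    positions = [i for i, ch in enumerate(marked) if ch == marker]
    return marked.replace(marker, ' '), positions


text, positions = fold_positions('this is a|folded block\n')
print(repr(text), positions)
```

The result can then be used as in the code above: wrap text in folded(...) and assign positions to its fold_pos attribute. A single-character marker keeps the indexes identical before and after the replacement, which is why the helper can compute them on the marked string.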