---
"main":
  "directory":
    "options":
      "directive": 'options'
      "item":
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex":
    "item":
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag":
    "item":
      "fileetag": 'Stuff'
  "keepalive":
    "item":
      "keepalive": 'Stuff'
  "keepalivetimeout":
    "item":
      "keepalivetimeout": 2
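Above is a YAML file that I need to parse, edit and then dump. I chose to do this with pyyaml on Python 2.7 (I am required to use it). I have been able to parse and edit it.
However, because the YAML uses one quoting style for keys and different styles for strings and integers, I cannot set a single default style. I would now like to know how to dump different styles for different types with pyyaml.
Here is what I use to parse and edit: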
import yaml

infile = yaml.load(open('yamlfile'))

# Recursive function to loop through the nested dictionary
def edit(d, keytoedit=None, newvalue=None):
    for key, value in d.iteritems():
        if isinstance(value, dict) and key == keytoedit and 'item' in value:
            value[value.iterkeys().next()] = {keytoedit: newvalue}
            edit(value, keytoedit=keytoedit, newvalue=newvalue)
        elif isinstance(value, dict) and keytoedit in value and 'item' not in value and key != 'main':
            value[keytoedit] = newvalue
            edit(value, keytoedit=keytoedit, newvalue=newvalue)
        elif isinstance(value, dict):
            edit(value, keytoedit=keytoedit, newvalue=newvalue)

outfile = file('outfile', 'w')
yaml.dump(infile, outfile, default_flow_style=False)
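So I would like to know how to achieve this. If I use default_style in yaml.dump, all types get the same style, and I need to adhere to the conventions of the original YAML file.
Can I somehow specify a style for particular types with pyyaml?
EDIT: This is what I have got so far; the missing parts are the double quotes on the keys and the single quotes on the strings.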
main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.html otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: 'On'
  keepalivetimeout:
    item:
      keepalivetimeout: 2
Answer 0 (score: 1)
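You can at least preserve the original flow/block style of the various elements with the normal yaml.dump(), for some value of "normal".
What you need is a loader that saves the flow/block style information while reading the data. It should subclass the normal types that carry a style (mappings/dicts and sequences/lists) so that they behave like the Python constructs normally returned by the loader, but with the style information attached. Then, on the way out with yaml.dump, you provide a custom dumper that takes this style information into account.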
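The idea of attaching style information to subclassed types and letting a custom dumper honour it can be sketched with plain PyYAML. This is only a minimal illustration for string scalars, not the round-trip loader described here: the names `KeyStr`, `ValueStr`, `StyledDumper` and `attach_styles` are hypothetical, and the styles are assigned by a fixed rule (keys double-quoted, string values single-quoted) rather than read from the input file.

```python
import yaml


class KeyStr(str):
    """Hypothetical marker type: a string to be dumped double-quoted."""


class ValueStr(str):
    """Hypothetical marker type: a string to be dumped single-quoted."""


class StyledDumper(yaml.SafeDumper):
    """Separate dumper class so yaml.SafeDumper itself is left untouched."""


def represent_key_str(dumper, data):
    # Emit the scalar as a regular YAML string, forcing double quotes.
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style='"')


def represent_value_str(dumper, data):
    # Emit the scalar as a regular YAML string, forcing single quotes.
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style="'")


StyledDumper.add_representer(KeyStr, represent_key_str)
StyledDumper.add_representer(ValueStr, represent_value_str)


def attach_styles(node):
    """Recursively wrap string keys and string values in the marker types."""
    if isinstance(node, dict):
        return {
            (KeyStr(k) if isinstance(k, str) else k): attach_styles(v)
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [attach_styles(item) for item in node]
    if isinstance(node, str):
        return ValueStr(node)
    return node  # integers and other types keep the dumper's default style


data = {'main': {'keepalive': {'item': {'keepalive': 'Stuff'}},
                 'keepalivetimeout': {'item': {'keepalivetimeout': 2}}}}
text = yaml.dump(attach_styles(data), Dumper=StyledDumper,
                 default_flow_style=False)
print(text)
```

Because the representers are registered on a dedicated dumper class, ordinary calls to yaml.dump elsewhere in a program are not affected.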
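I use the normal yaml.load and yaml.dump in my enhanced version of PyYAML, called ruamel.yaml, but with special loader and dumper classes, RoundTripLoader (for yaml.load) and RoundTripDumper (for yaml.dump), that preserve the flow/block style (and any comments you might have in the file):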
import ruamel.yaml as yaml

infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)

for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400

print yaml.dump(infile, Dumper=yaml.RoundTripDumper)
gives you:
main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.htm otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: Stuff
  keepalivetimeout:
    item:
      keepalivetimeout: 400
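If you cannot install ruamel.yaml, you can pull the code out of my repository and include it in your own code. AFAIK PyYAML has not been upgraded since I started working on this.
I currently do not preserve superfluous quotes on scalars, but I do preserve chomping information (for multi-line scalars starting with '|'). The quoting information is thrown away very early in the input processing of the YAML file, and preserving it would require multiple changes.
As you seem to use different quotes for key and value string scalars, you can achieve the output you want by overriding process_scalar (part of the Emitter in emitter.py) so that it adds the quotes depending on whether the string scalar is a key or not, and an integer or not: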
import ruamel.yaml as yaml

# the scalar emitter from emitter.py
def process_scalar(self):
    if self.analysis is None:
        self.analysis = self.analyze_scalar(self.event.value)
    if self.style is None:
        self.style = self.choose_scalar_style()
    split = (not self.simple_key_context)
    # VVVVVVVVVVVVVVVVVVVV added
    try:
        int(self.event.value)  # might need to expand this
    except ValueError:
        # we have a string: single quotes for values, double quotes for keys
        if split:
            self.style = "'"
        else:
            self.style = '"'
    # ^^^^^^^^^^^^^^^^^^^^
    # if self.analysis.multiline and split    \
    #         and (not self.style or self.style in '\'\"'):
    #     self.write_indent()
    if self.style == '"':
        self.write_double_quoted(self.analysis.scalar, split)
    elif self.style == '\'':
        self.write_single_quoted(self.analysis.scalar, split)
    elif self.style == '>':
        self.write_folded(self.analysis.scalar)
    elif self.style == '|':
        self.write_literal(self.analysis.scalar)
    else:
        self.write_plain(self.analysis.scalar, split)
    self.analysis = None
    self.style = None
    if self.event.comment:
        self.write_post_comment(self.event)

infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)

for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400

# replace the emitter's scalar handling on the round-trip dumper class
dd = yaml.RoundTripDumper
dd.process_scalar = process_scalar

print '---'
print yaml.dump(infile, Dumper=dd)
gives you:
---
"main":
  "directory":
    "options":
      "directive": 'options'
      "item":
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex":
    "item":
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag":
    "item":
      "fileetag": 'Stuff'
  "keepalive":
    "item":
      "keepalive": 'Stuff'
  "keepalivetimeout":
    "item":
      "keepalivetimeout": 400
which is pretty close to what you asked for.
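The quoting rule that the patched process_scalar applies can be isolated as a small pure function, which makes it easier to check against your own data before touching the emitter. This is only a sketch of that decision: `choose_quote_style` is a hypothetical helper, not part of PyYAML or ruamel.yaml, and like the patch it treats only values accepted by int() as non-strings.

```python
def choose_quote_style(value, is_key):
    """Return the quote character for a scalar, or None to leave the
    emitter's own style choice in place (as the patch does for integers)."""
    try:
        int(value)
    except ValueError:
        # A string scalar: keys get double quotes, values single quotes.
        return '"' if is_key else "'"
    return None


for value, is_key in [('main', True), ('Stuff', False), ('400', False)]:
    print(value, is_key, choose_quote_style(value, is_key))
```

If your file also contains floats or booleans that must stay unquoted, the int() test would need to be widened accordingly, which is what the "might need to expand this" comment in the patch refers to.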