PyYAML - using different styles for keys, integers and strings

Time: 2015-05-12 18:13:28

Tags: python python-2.7 yaml pyyaml

--- 
"main": 
  "directory": 
    "options": 
      "directive": 'options'
      "item": 
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex": 
    "item": 
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag": 
    "item": 
      "fileetag": 'Stuff'
  "keepalive": 
    "item": 
      "keepalive": 'Stuff'
  "keepalivetimeout": 
    "item": 
      "keepalivetimeout": 2

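Above is a YAML file that I need to parse, edit and then dump. I have chosen to do this with PyYAML on Python 2.7 (I am required to use it). So far I have been able to parse and edit.

However, because the YAML uses one style for keys and different styles for strings and integers, I cannot simply set a default style. I would now like to know how to dump different styles for different types with PyYAML.

Below is what I use to parse and edit: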

import yaml

infile = yaml.load(open('yamlfile'))

#Recursive function to loop through nested dictionary
def edit(d,keytoedit=None,newvalue=None):
  for key, value in d.iteritems():
    if isinstance(value, dict) and key == keytoedit and 'item' in value:
      value[value.iterkeys().next()] = {keytoedit:newvalue}
      edit(value,keytoedit=keytoedit,newvalue=newvalue)
    elif isinstance(value, dict) and keytoedit in value and 'item' not in value and key != 'main':
      value[keytoedit] = newvalue
      edit(value,keytoedit=keytoedit,newvalue=newvalue)
    elif isinstance(value, dict):
      edit(value,keytoedit=keytoedit,newvalue=newvalue)

outfile = file('outfile','w')
yaml.dump(infile, outfile,default_flow_style=False)

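So I would like to know how to achieve this. If I use default_style in yaml.dump, all types get the same style, and I need to adhere to the standard of the original YAML file.

Can I somehow specify the style for specific types with PyYAML?

EDIT: This is what I have so far; the missing pieces are the double quotes on the keys and the single quotes on the strings.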

main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.html otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: 'On'
  keepalivetimeout:
    item:
      keepalivetimeout: 2

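1 Answer:

Answer 0 (score: 1)

At a minimum, you can preserve the original flow/block style of the various elements with the normal yaml.dump(), for some value of "normal".

What you need is a loader that saves the flow/block style information while reading the data, and subclasses of the normal types that have a style (mappings/dicts and sequences/lists, respectively) that behave like the Python constructs normally returned by the loader but have the style information attached. Then, on the way out with yaml.dump, you provide a custom dumper that takes this style information into account.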
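
The dump-side half of that idea can be sketched with plain PyYAML representers. This is only an illustration under assumptions: the names KeyStr, ValueStr and tag_styles are hypothetical, the sketch is written with Python 3 in mind, and it does not read any style information from the input file; it just assigns a style by position (key vs. string value) before dumping.

```python
import yaml


class KeyStr(str):
    """Hypothetical marker type: a string to be emitted double-quoted (keys)."""


class ValueStr(str):
    """Hypothetical marker type: a string to be emitted single-quoted (values)."""


def represent_key(dumper, data):
    # style='"' asks the emitter for a double-quoted scalar
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style='"')


def represent_value(dumper, data):
    # style="'" asks the emitter for a single-quoted scalar
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style="'")


# Register the representers on PyYAML's default Dumper
yaml.add_representer(KeyStr, represent_key)
yaml.add_representer(ValueStr, represent_value)


def tag_styles(node):
    """Recursively wrap mapping keys and string values in the marker types."""
    if isinstance(node, dict):
        return dict((KeyStr(k), tag_styles(v)) for k, v in node.items())
    if isinstance(node, list):
        return [tag_styles(v) for v in node]
    if isinstance(node, str):
        return ValueStr(node)
    return node  # integers and other scalars keep PyYAML's default style


data = {'main': {'keepalive': {'item': {'keepalive': 'Stuff'}},
                 'keepalivetimeout': {'item': {'keepalivetimeout': 2}}}}

text = yaml.dump(tag_styles(data), default_flow_style=False)
print(text)
```

With this, keys should come out double-quoted, string values single-quoted and integers plain, because only the two marker types get an explicit style.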

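I use the normal yaml.dump in my enhanced version of PyYAML called ruamel.yaml, but it has special loader and dumper classes, RoundTripLoader (for yaml.load) and RoundTripDumper, that preserve the flow/block style (and any comments you might have in the file):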

import ruamel.yaml as yaml

infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)

for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400

print yaml.dump(infile, Dumper=yaml.RoundTripDumper)

which gives you:

main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.htm otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: Stuff
  keepalivetimeout:
    item:
      keepalivetimeout: 400

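If you cannot install ruamel.yaml, you can pull the code out of my repository and include it in your own code. AFAIK PyYAML has not been upgraded since I started working on this.

I currently do not preserve superfluous quotes on scalars, but I do preserve the chomping information (for multi-line scalars starting with '|'). The quoting information is thrown away very early in the input processing of the YAML file, and preserving it would require multiple changes.

Since you seem to have different quotes for key and value string scalars, you can achieve the output you want by overriding process_scalar (part of the Emitter in emitter.py) so that it adds quotes depending on whether the scalar is a key or not, and whether it is an integer or not: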

import ruamel.yaml as yaml

# the scalar emitter from emitter.py
def process_scalar(self):
    if self.analysis is None:
        self.analysis = self.analyze_scalar(self.event.value)
    if self.style is None:
        self.style = self.choose_scalar_style()
    split = (not self.simple_key_context)
    # VVVVVVVVVVVVVVVVVVVV added
    try:
        x = int(self.event.value)  # might need to expand this
    except (TypeError, ValueError):
        # we have string
        if split:
            self.style = "'"
        else:
            self.style = '"'
    # ^^^^^^^^^^^^^^^^^^^^
    # if self.analysis.multiline and split    \
    #         and (not self.style or self.style in '\'\"'):
    #     self.write_indent()
    if self.style == '"':
        self.write_double_quoted(self.analysis.scalar, split)
    elif self.style == '\'':
        self.write_single_quoted(self.analysis.scalar, split)
    elif self.style == '>':
        self.write_folded(self.analysis.scalar)
    elif self.style == '|':
        self.write_literal(self.analysis.scalar)
    else:
        self.write_plain(self.analysis.scalar, split)
    self.analysis = None
    self.style = None
    if self.event.comment:
        self.write_post_comment(self.event)


infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)

for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400

dd = yaml.RoundTripDumper
dd.process_scalar = process_scalar

print '---'
print yaml.dump(infile, Dumper=dd)

which gives you:

---
"main":
  "directory":
    "options":
      "directive": 'options'
      "item":
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex":
    "item":
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag":
    "item":
      "fileetag": 'Stuff'
  "keepalive":
    "item":
      "keepalive": 'Stuff'
  "keepalivetimeout":
    "item":
      "keepalivetimeout": 400

This is pretty close to what you asked for.
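
The quoting rule that the patched process_scalar applies can also be looked at in isolation. The following is a minimal standalone sketch, not part of ruamel.yaml or PyYAML: the function names choose_quote_style and render are hypothetical, and render does no escaping of quotes inside the text.

```python
def choose_quote_style(text, is_key):
    """Return the quote character for a scalar, or None for a plain scalar."""
    try:
        int(text)
    except ValueError:
        # Not an integer: keys get double quotes, values get single quotes
        return '"' if is_key else "'"
    return None  # integers stay unquoted


def render(text, is_key):
    """Wrap text in the chosen quotes (simplified: no escaping)."""
    quote = choose_quote_style(text, is_key)
    return text if quote is None else quote + text + quote


print(render('keepalivetimeout', is_key=True))
print(render('Stuff', is_key=False))
print(render('400', is_key=False))
```

Note that the rule looks only at the text of the scalar, so a string value such as '400' would also be emitted unquoted; this is the part the comment "might need to expand this" refers to.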