Is it possible to use PyYAML to read a text file written with a "YAML front matter" block inside?

Time: 2014-09-12 18:28:38

Tags: python yaml pyyaml

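I'm sorry, I know very little about either YAML or PyYAML, but I fell in love with the idea of supporting a configuration file written in the same style used by "Jekyll" (http://jekyllrb.com/docs/frontmatter/), which AFAIK has these "YAML Front Matter" blocks that look very cool and sexy to me. So I installed PyYAML on my computer and wrote a small file with this block of text: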

---
First Name: John
Second Name: Doe
Born: Yes
---

Lorem ipsum dolor sit amet, consectetur adipiscing elit,  
sed do eiusmod tempor incididunt ut labore et dolore magna  
aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco 
laboris nisi ut aliquip ex ea commodo consequat.

Then I tried to read the text file with Python 3.4 and PyYAML using the following code:

import yaml

stream = open("test.yaml")
a = stream.read()
b = yaml.load(a)

But apparently it doesn't work, and Python shows this error message:

Traceback (most recent call last):
  File "<pyshell#62>", line 1, in <module>
    b = yaml.load(a)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/yaml/__init__.py", line 72, in load
    return loader.get_single_data()
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/yaml/constructor.py", line 35, in get_single_data
    node = self.get_single_node()
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/yaml/composer.py", line 43, in get_single_node
    event.start_mark)
yaml.composer.ComposerError: expected a single document in the stream
  in "<unicode string>", line 2, column 1:
    First Name: John
    ^
but found another document
  in "<unicode string>", line 5, column 1:
    ---
    ^

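Could you help me, please? Did I write the code the wrong way, or does this mean that PyYAML can't handle YAML front matter blocks? Is there anything else I could try with PyYAML, or do I have to write my own parser using regular expressions?

Thank you very much for your time!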

2 answers:

Answer 0: (score: 7)

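The Python yaml library does not support reading YAML that is embedded in a document. Here is a utility function that extracts the YAML text, so you can parse it before reading the rest of the file: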

#!/usr/bin/python2.7

import yaml
import sys

def get_yaml(f):
  pointer = f.tell()
  if f.readline() != '---\n':
    f.seek(pointer)
    return ''
  readline = iter(f.readline, '')
  readline = iter(readline.next, '---\n')
  return ''.join(readline)


for filename in sys.argv[1:]:
  with open(filename) as f:
    config = yaml.load(get_yaml(f))
    text = f.read()
    print "TEXT from", filename
    print text
    print "CONFIG from", filename
    print config
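
The script above targets Python 2.7 (`readline.next` and the `print` statement do not exist in Python 3), while the question uses Python 3.4. The following is a minimal Python 3 sketch of the same idea, not the answerer's own code: the name `split_front_matter` and the in-memory sample are made up for illustration, and it only performs the split, using nothing beyond the standard library.

```python
import io


def split_front_matter(f):
    """Return (front_matter_text, body_text) read from an open text file.

    If the file does not start with a '---' line, the front matter is ''
    and the whole file is returned as the body.
    """
    start = f.tell()
    if f.readline().rstrip("\r\n") != "---":
        f.seek(start)  # no front matter: rewind and treat everything as body
        return "", f.read()
    lines = []
    # Call readline() repeatedly until end of file ('') or the closing '---'.
    for line in iter(f.readline, ""):
        if line.rstrip("\r\n") == "---":
            break
        lines.append(line)
    return "".join(lines), f.read()


# Hypothetical sample standing in for the file from the question.
sample = io.StringIO(
    "---\n"
    "First Name: John\n"
    "Second Name: Doe\n"
    "---\n"
    "\n"
    "Lorem ipsum dolor sit amet\n"
)
yaml_text, body = split_front_matter(sample)
print(repr(yaml_text))
print(repr(body))
```

The first element can then be handed to `yaml.safe_load()` to obtain a dict, and the body stays an untouched string that is never interpreted as YAML.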

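Answer 1: (score: 4)

You can do this without any custom parsing by calling yaml.load_all(). This returns a generator whose first item is the expected front matter as a dict, and whose second item is the rest of the document as a string: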

import yaml

with open('some-file-with-front-matter.md') as f:
    front_matter, content = list(yaml.load_all(f))[:2]

If you only want the front matter, it's even simpler:

import yaml

with open('some-file-with-front-matter.md') as f:
    front_matter = next(yaml.load_all(f))

This works because yaml.load_all() is for loading several YAML documents within the same stream, separated by ---. Also, make sure to take the usual precautions when loading YAML from an unknown source.
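
One way to act on that precaution is `yaml.safe_load_all()`, PyYAML's restricted counterpart to `load_all()`, which only builds plain Python types. The sketch below is an illustration rather than part of the original answer; it reads a hypothetical in-memory sample instead of a real file and asks only for the first document. Keep in mind that the body of a Markdown file is not guaranteed to be valid YAML, so the two-item unpacking shown earlier only works when the text happens to parse as a single YAML scalar.

```python
import io
import yaml

# Hypothetical sample standing in for a file with front matter.
document = io.StringIO(
    "---\n"
    "First Name: John\n"
    "Second Name: Doe\n"
    "---\n"
    "\n"
    "Lorem ipsum dolor sit amet\n"
)

# safe_load_all() lazily yields one Python object per YAML document.
# next() requests only the first one, i.e. the front matter block.
front_matter = next(yaml.safe_load_all(document))
print(front_matter)
```

If the body text is needed as well, splitting the file on the `---` lines and passing only the front matter to `yaml.safe_load()` avoids interpreting the body as YAML at all.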