Parsing YAML, returning line numbers

Time: 2012-11-10 03:51:05

Tags: python parsing yaml pyyaml

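I'm building a documentation generator from YAML data, and it needs to report which line of the YAML file each item was generated from. What is the best way to do this? For example, if the YAML file looks like this: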

- key1: item 1
  key2: item 2
- key1: another item 1
  key2: another item 2

I want something like this:

[
     {'__line__': 1, 'key1': 'item 1', 'key2': 'item 2'},
     {'__line__': 3, 'key1': 'another item 1', 'key2': 'another item 2'},
]

I'm currently using PyYAML, but any other library is fine as long as I can use it from Python.

3 answers:

Answer 0 (score: 10)

I did it by adding hooks to Composer.compose_node and Constructor.construct_mapping:

import yaml
from yaml.composer import Composer
from yaml.constructor import Constructor

def main():
    # Read the whole file up front so the handle is closed before parsing.
    with open('data.yml') as f:
        loader = yaml.Loader(f.read())
    def compose_node(parent, index):
        # the line number where the previous token has ended (plus empty lines)
        line = loader.line
        node = Composer.compose_node(loader, parent, index)
        node.__line__ = line + 1
        return node
    def construct_mapping(node, deep=False):
        mapping = Constructor.construct_mapping(loader, node, deep=deep)
        mapping['__line__'] = node.__line__
        return mapping
    loader.compose_node = compose_node
    loader.construct_mapping = construct_mapping
    data = loader.get_single_data()
    print(data)

if __name__ == '__main__':
    main()

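Answer 1 (score: 4)

If you use ruamel.yaml >= 0.9 (I am the author) with the RoundTripLoader, you can access the lc attribute on collection items to get the line and column where they start in the source YAML: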

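# Note: `load` is not defined in this snippet; it appears to be a round-trip
# loading helper from the ruamel.yaml test suite that also dedents its input.
def test_item_04(self):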
    data = load("""
     # testing line and column based on SO
     # http://stackoverflow.com/questions/13319067/
     - key1: item 1
       key2: item 2
     - key3: another item 1
       key4: another item 2
        """)
    assert data[0].lc.line == 2
    assert data[0].lc.col == 2
    assert data[1].lc.line == 4
    assert data[1].lc.col == 2

(Lines and columns are counted from 0.)

This answer shows how to add the lc attribute to string types during loading.

Answer 2 (score: 3)

Here is an improved version of puzzlet's answer:

import yaml
from yaml.loader import SafeLoader

class SafeLineLoader(SafeLoader):
    def construct_mapping(self, node, deep=False):
        mapping = super(SafeLineLoader, self).construct_mapping(node, deep=deep)
        # Add 1 so line numbering starts at 1
        mapping['__line__'] = node.start_mark.line + 1
        return mapping

You can use it like this:

data = yaml.load(whatever, Loader=SafeLineLoader)
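To check this loader against the sample from the question, here is a minimal end-to-end sketch, assuming PyYAML is installed. The class is repeated so the block runs on its own, and `SAMPLE` is a name made up for this illustration.

```python
import yaml
from yaml.loader import SafeLoader


class SafeLineLoader(SafeLoader):
    """SafeLoader that stores the 1-based starting line of every mapping."""

    def construct_mapping(self, node, deep=False):
        mapping = super().construct_mapping(node, deep=deep)
        # start_mark.line is zero-based; add 1 for human-friendly numbering.
        mapping['__line__'] = node.start_mark.line + 1
        return mapping


# Sample document from the question (hypothetical variable name).
SAMPLE = """\
- key1: item 1
  key2: item 2
- key1: another item 1
  key2: another item 2
"""

data = yaml.load(SAMPLE, Loader=SafeLineLoader)
for item in data:
    print(item['__line__'], item['key1'])
```

Two things to keep in mind with this approach: nested mappings receive a `__line__` key as well, and a real key named `__line__` in the YAML would be overwritten.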