Parsing nested OrderedDicts in Python

Time: 2018-09-12 20:42:58

Tags: python parsing yaml ordereddict

If the file looks like this:

OrderedDict
([
 ('activateable', False),
 ('Thisfield', 
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]),
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)])
 ),
('Thisfield2', 
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]),
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)])
 ),
('Thisfield3', 
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]),
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)])
 )
 ('pin', False)
])

...and I only want to return 'Thisfield1,Thisfield2,Thisfield3'?

1 answer:

Answer 0 (score: 0)

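At first I thought your input was Python, but it isn't:

  • it has Unicode left and right single quotation marks (U+2018 / U+2019)
  • it has unbalanced square brackets
  • it needs at least a comma before ('pin', False)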
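These observations can be checked mechanically. The following is a minimal sketch, not part of the original answer: `SAMPLE` is a shortened copy of the question's input, and the helper names `has_curly_quotes`, `bracket_balance` and `is_valid_python` are made up for illustration.

```python
import ast

# Shortened copy of the question's input, typographic quotes included.
SAMPLE = """OrderedDict
([
 ('activateable', False),
 ('Thisfield',
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]),
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)])
 )
 ('pin', False)
])
"""


def has_curly_quotes(text):
    """True if text contains U+2018 or U+2019."""
    return "\u2018" in text or "\u2019" in text


def bracket_balance(text):
    """Number of '[' minus number of ']'; non-zero means unbalanced."""
    return text.count("[") - text.count("]")


def is_valid_python(text):
    """True if the text parses as Python source."""
    try:
        ast.parse(text)
    except SyntaxError:
        return False
    return True


print(has_curly_quotes(SAMPLE))   # True
print(bracket_balance(SAMPLE))    # 2
print(is_valid_python(SAMPLE))    # False
```

Any one of the three problems is enough to make `ast.parse` reject the text, so the last check alone does not tell you which of them is responsible.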

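So, given the tags on your question, it must be a YAML document, which means its content is a single multi-line plain scalar. When you load it with a YAML parser, you get the whole scalar back as a single string without newlines: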

OrderedDict ([ ('activateable', False), ('Thisfield', [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]), [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)]) ), ('Thisfield2', [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]), [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)]) ), ('Thisfield3', [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]), [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)]) ) ('pin', False) ])

That is not as easy to parse as the original input file.
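If the folded single-line string is all you have, a regular expression can still pull the names out of it. This is only a sketch under assumptions: the function name `thisfields_from_scalar` is hypothetical, and it assumes every wanted key starts with `This`, is wrapped in straight single quotes, and directly follows an opening parenthesis.

```python
import re


def thisfields_from_scalar(scalar):
    """Return all keys of the form ('This...' found in a one-line string."""
    # "(" then optional whitespace, then a straight-quoted name starting with "This".
    return re.findall(r"\(\s*'(This[^']*)'", scalar)


# Shortened stand-in for the folded scalar shown above.
flat = (
    "OrderedDict ([ ('activateable', False), "
    "('Thisfield', [OrderedDict ([ ('autoNumber', False) ]) ), "
    "('Thisfield2', [OrderedDict ([ ('autoNumber', False) ]) ) "
    "('pin', False) ])"
)
print(", ".join(thisfields_from_scalar(flat)))  # Thisfield, Thisfield2
```

The pattern depends on the exact quoting and spacing of the input, so it is no more robust than working on the lines of the file directly.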

So it is probably easier to simply "parse" the input line by line:

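def get_thisfields(fp):
    """Return the 'This...' key names found in fp, joined with ', '."""
    vals = []
    for line in fp:
        line = line.strip()
        # Only lines that open a "('This..." tuple carry a wanted key.
        if not line.startswith(u"('This"):
            continue
        # The key is the text between the first pair of single quotes.
        vals.append(line.split("'")[1])
    return ', '.join(vals)

# Use a context manager so the file handle is closed after reading.
with open('input.yaml') as fp:
    print(get_thisfields(fp))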

Given your input "YAML" file, get_thisfields() returns:

Thisfield, Thisfield2, Thisfield3

which is what you asked for.
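To try the same line-based idea without creating a file, you can feed it an in-memory stream. The variant below is a sketch rather than the code from the answer: `iter_thisfields` and its `prefix` parameter are hypothetical names, and it accepts any iterable of lines.

```python
import io


def iter_thisfields(lines, prefix="This"):
    """Yield the first quoted name on each line that starts with ('<prefix>."""
    marker = "('" + prefix
    for line in lines:
        line = line.strip()
        if line.startswith(marker):
            # The name sits between the first pair of straight single quotes.
            yield line.split("'")[1]


# In-memory stand-in for the first lines of the input file.
sample = io.StringIO(
    "OrderedDict\n"
    "([\n"
    " ('activateable', False),\n"
    " ('Thisfield', \n"
    "('Thisfield2', \n"
    " ('pin', False)\n"
    "])\n"
)
print(", ".join(iter_thisfields(sample)))  # Thisfield, Thisfield2
```

Returning a generator leaves the choice of separator to the caller, for example `",".join(...)` if you want no space after the commas.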