Parsing a devicetree with pyparsing into a structured dictionary

Date: 2017-04-13 20:03:18

Tags: python pyparsing device-tree

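For my C++ RTOS I'm writing a parser for devicetree "source" files (.dts) in Python, using the pyparsing module. I'm able to parse the structure of the devicetree into a (nested) dictionary, where the property name or node name is the dictionary key (a string) and the property value or node is the dictionary value (either a string or a nested dictionary).

Let's assume I have the following example devicetree structure: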

/ {
    property1 = "string1";
    property2 = "string2";
    node1 {
        property11 = "string11";
        property12 = "string12";
        node11 {
            property111 = "string111";
            property112 = "string112";
        };
    };
    node2 {
        property21 = "string21";
        property22 = "string22";
    };
};

I'm able to parse that into something like this:

{'/': {'node1': {'node11': {'property111': ['string111'], 'property112': ['string112']},
                 'property11': ['string11'],
                 'property12': ['string12']},
       'node2': {'property21': ['string21'], 'property22': ['string22']},
       'property1': ['string1'],
       'property2': ['string2']}}

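However, for my needs I'd prefer to have this data structured differently. I'd like to have all properties as a nested dictionary under the key "properties" and all child nodes as a nested dictionary under the key "children". The reason is that the devicetree (especially its nodes) has some "metadata" which I'd like to keep as plain key-value pairs, and that requires moving the actual "contents" of a node one level "lower" to avoid any name conflicts between keys. So I'd like the example above to look like this: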

{'/': {
  'properties': {
    'property1': ['string1'],
    'property2': ['string2']
  },
  'children': {
    'node1': {
      'properties': {
        'property11': ['string11'],
        'property12': ['string12']
      },
      'children': {
        'node11': {
          'properties': {
            'property111': ['string111'],
            'property112': ['string112']
          },
          'children': {
          }
        }
      }
    },
    'node2': {
      'properties': {
        'property21': ['string21'],
        'property22': ['string22']
      },
      'children': {
      }
    }
  }
}
}

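I tried adding a "name" to the parsed tokens, but this results in "doubled" dictionary elements (which is expected, as this behaviour is described in the pyparsing documentation). This might not be a problem, but technically a node or property can be named "properties" or "children" (or whatever I choose), so I don't think such a solution is robust.

I've also tried to use setParseAction() to convert the tokens into a dictionary fragment (I hoped that I could transform {'key': 'value'} into {'properties': {'key': 'value'}}), but this did not work at all...

Is this possible directly with pyparsing at all? I'm prepared to do a second phase that transforms the original dictionary into whatever structure I need, but as a perfectionist I'd prefer a single-run, pyparsing-only solution - if possible.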
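For comparison, the second-phase conversion mentioned above could look roughly like the sketch below. It is not part of the pyparsing grammar; the helper name `restructure` is hypothetical, and it assumes that every nested dict in the flat result is a child node while every other value (here, a list) is a property.

```python
import pprint


def restructure(node):
    """Split a flat node dict into 'properties' and 'children' sub-dicts.

    Assumption: nested dicts are child nodes, anything else is a property value.
    """
    properties = {}
    children = {}
    for key, value in node.items():
        if isinstance(value, dict):
            children[key] = restructure(value)
        else:
            properties[key] = value
    return {'properties': properties, 'children': children}


# The flat dictionary produced by the original grammar (copied from above).
flat = {'/': {'node1': {'node11': {'property111': ['string111'],
                                   'property112': ['string112']},
                        'property11': ['string11'],
                        'property12': ['string12']},
              'node2': {'property21': ['string21'], 'property22': ['string22']},
              'property1': ['string1'],
              'property2': ['string2']}}

structured = {name: restructure(node) for name, node in flat.items()}
pprint.pprint(structured, width=120)
```

Because the split happens on the value type rather than on the key, a property or node that is itself named "properties" or "children" does not collide with the wrapper keys.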

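For reference, here's some sample code (Python 3) which transforms the devicetree source into an "unstructured" dictionary. Please note that this code is just a simplification and doesn't support all the features found in .dts files (any data type other than strings, value lists, unit addresses, labels and so on) - it only supports string properties and node nesting.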

#!/usr/bin/env python

import pyparsing
import pprint

nodeName = pyparsing.Word(pyparsing.alphas, pyparsing.alphanums + ',._+-', max = 31)
propertyName = pyparsing.Word(pyparsing.alphanums + ',._+?#', max = 31)
propertyValue = pyparsing.dblQuotedString.setParseAction(pyparsing.removeQuotes)
property = pyparsing.Dict(pyparsing.Group(propertyName + pyparsing.Group(pyparsing.Literal('=').suppress() +
        propertyValue) + pyparsing.Literal(';').suppress()))
childNode = pyparsing.Forward()
rootNode = pyparsing.Dict(pyparsing.Group(pyparsing.Literal('/') + pyparsing.Literal('{').suppress() +
        pyparsing.ZeroOrMore(property) + pyparsing.ZeroOrMore(childNode) +
        pyparsing.Literal('};').suppress()))
childNode <<= pyparsing.Dict(pyparsing.Group(nodeName + pyparsing.Literal('{').suppress() +
        pyparsing.ZeroOrMore(property) + pyparsing.ZeroOrMore(childNode) +
        pyparsing.Literal('};').suppress()))

dictionary = rootNode.parseString("""
/ {
    property1 = "string1";
    property2 = "string2";
    node1 {
        property11 = "string11";
        property12 = "string12";
        node11 {
            property111 = "string111";
            property112 = "string112";
        };
    };
    node2 {
        property21 = "string21";
        property22 = "string22";
    };
};
""").asDict()
pprint.pprint(dictionary, width = 120)

1 answer:

Answer 0 (score: 1)

You were really close. I just did the following:

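  • Added Groups and results names for the "properties" and "children" sub-sections
  • Changed some of the punctuation literals to constants (Literal("};") will fail to match if there is whitespace between the closing brace and the semicolon, but RBRACE + SEMI will accommodate it)
  • Removed the Dict from the outermost rootNode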
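The whitespace point in the second bullet can be checked in isolation. The following is a minimal sketch; the helper `matches` is a hypothetical name introduced only for this illustration, and the camelCase pyparsing method names are used to stay consistent with the code in the question.

```python
import pyparsing

# A single literal: the two characters must be adjacent in the input.
closing_literal = pyparsing.Literal('};')

# Two separate tokens: pyparsing skips whitespace before each of them.
RBRACE, SEMI = map(pyparsing.Suppress, '};')
closing_tokens = RBRACE + SEMI


def matches(parser, text):
    """Return True if the parser consumes the whole text."""
    try:
        parser.parseString(text, parseAll=True)
        return True
    except pyparsing.ParseException:
        return False


for text in ('};', '} ;', '}\n;'):
    print(repr(text), matches(closing_literal, text), matches(closing_tokens, text))
```

Only the first input is accepted by the single literal, while the two-token form accepts all three.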

Code:

import pprint
import pyparsing

# Suppressed punctuation tokens: they are matched but left out of the results.
LBRACE,RBRACE,SLASH,SEMI,EQ = map(pyparsing.Suppress, "{}/;=")
nodeName = pyparsing.Word(pyparsing.alphas, pyparsing.alphanums + ',._+-', max = 31)
propertyName = pyparsing.Word(pyparsing.alphanums + ',._+?#', max = 31)
propertyValue = pyparsing.dblQuotedString.setParseAction(pyparsing.removeQuotes)
property = pyparsing.Dict(pyparsing.Group(propertyName + EQ
                                          + pyparsing.Group(propertyValue)
                                          + SEMI))
childNode = pyparsing.Forward()
# The root node is a plain Group (no Dict), so its contents are reached as result[0].
rootNode = pyparsing.Group(SLASH + LBRACE
                           + pyparsing.Group(pyparsing.ZeroOrMore(property))("properties")
                           + pyparsing.Group(pyparsing.ZeroOrMore(childNode))("children")
                           + RBRACE + SEMI)
childNode <<= pyparsing.Dict(pyparsing.Group(nodeName + LBRACE
                                             + pyparsing.Group(pyparsing.ZeroOrMore(property))("properties")
                                             + pyparsing.Group(pyparsing.ZeroOrMore(childNode))("children")
                                             + RBRACE + SEMI))
# `result` below is what rootNode.parseString(...) returns for the devicetree text from the question.

Convert to a dict using asDict and print it with pprint:

pprint.pprint(result[0].asDict())
{'children': {'node1': {'children': {'node11': {'children': [],
                                                'properties': {'property111': ['string111'],
                                                               'property112': ['string112']}}},
                        'properties': {'property11': ['string11'],
                                       'property12': ['string12']}},
              'node2': {'children': [],
                        'properties': {'property21': ['string21'],
                                       'property22': ['string22']}}},
 'properties': {'property1': ['string1'], 'property2': ['string2']}}
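Note that in the output above an empty "children" group comes out as an empty list rather than an empty dict, so code that walks the structure has to tolerate both. The sketch below shows one way to do that; the generator name `iter_nodes` is hypothetical, and the dictionary is copied from the output above rather than produced by a parser.

```python
def iter_nodes(node, path='/'):
    """Yield (path, properties) for a node and all of its descendants."""
    # An empty group is rendered as [] by asDict(), so normalise it to {}.
    yield path, (node['properties'] or {})
    children = node['children'] or {}
    for name, child in children.items():
        yield from iter_nodes(child, path.rstrip('/') + '/' + name)


# The structured dictionary, copied from the pprint output above.
root = {'children': {'node1': {'children': {'node11': {'children': [],
                                                       'properties': {'property111': ['string111'],
                                                                      'property112': ['string112']}}},
                               'properties': {'property11': ['string11'],
                                              'property12': ['string12']}},
                     'node2': {'children': [],
                               'properties': {'property21': ['string21'],
                                              'property22': ['string22']}}},
        'properties': {'property1': ['string1'], 'property2': ['string2']}}

paths = [path for path, _ in iter_nodes(root)]
print(paths)
```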

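You can also use the dump() method of pyparsing's ParseResults class to help visualize both the list-style and the dict/namespace-style access to the results as they are, without any conversion call: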

print(result[0].dump())

[[['property1', ['string1']], ['property2', ['string2']]], [['node1', [['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]], ['node2', [['property21', ['string21']], ['property22', ['string22']]], []]]]
- children: [['node1', [['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]], ['node2', [['property21', ['string21']], ['property22', ['string22']]], []]]
  - node1: [[['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]]
    - children: [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]
      - node11: [[['property111', ['string111']], ['property112', ['string112']]], []]
        - children: []
        - properties: [['property111', ['string111']], ['property112', ['string112']]]
          - property111: ['string111']
          - property112: ['string112']
    - properties: [['property11', ['string11']], ['property12', ['string12']]]
      - property11: ['string11']
      - property12: ['string12']
  - node2: [[['property21', ['string21']], ['property22', ['string22']]], []]
    - children: []
    - properties: [['property21', ['string21']], ['property22', ['string22']]]
      - property21: ['string21']
      - property22: ['string22']
- properties: [['property1', ['string1']], ['property2', ['string2']]]
  - property1: ['string1']
  - property2: ['string2']
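The names listed in the dump can be used directly on the ParseResults object. The sketch below is a condensed restatement of the grammar above, not a different technique: the shared `body` expression, the snake_case variable names, the `pp` alias and the shortened sample input are choices made for this illustration only, and `.copy()` is added so the module-level dblQuotedString is not modified.

```python
import pyparsing as pp

LBRACE, RBRACE, SLASH, SEMI, EQ = map(pp.Suppress, "{}/;=")
node_name = pp.Word(pp.alphas, pp.alphanums + ',._+-', max=31)
property_name = pp.Word(pp.alphanums + ',._+?#', max=31)
property_value = pp.dblQuotedString.copy().setParseAction(pp.removeQuotes)
prop = pp.Dict(pp.Group(property_name + EQ + pp.Group(property_value) + SEMI))

child_node = pp.Forward()
# The part between the braces is the same for the root node and for child nodes.
body = (LBRACE
        + pp.Group(pp.ZeroOrMore(prop))("properties")
        + pp.Group(pp.ZeroOrMore(child_node))("children")
        + RBRACE + SEMI)
root_node = pp.Group(SLASH + body)
child_node <<= pp.Dict(pp.Group(node_name + body))

# A shortened version of the sample devicetree from the question.
source = """
/ {
    property1 = "string1";
    node1 {
        property11 = "string11";
        node11 {
            property111 = "string111";
        };
    };
};
"""

result = root_node.parseString(source)
root = result[0]

# Dict-style access, following the names shown by dump().
print(root['properties']['property1'][0])
print(root['children']['node1']['properties']['property11'][0])
print(root['children']['node1']['children']['node11']['properties']['property111'][0])
```

Each property value is itself a one-element group, which is why the final `[0]` is needed to get the plain string.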