Question

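Problem: I am trying to get the desired result from the XML provided. With the code I tried, all the dictionaries get mixed up, and under the 'bar' key I get the ipsum values.

Wanted result: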

{'bar': [{'a': 'foo', 'b': 'bar', 'c': 'bazz', 'd': 'boo', 'e': 'bing'}, {'a': 'foo', 'b': 'bar', 'c': 'bazz', 'd': 'boo'}], 'ipsum': [{'a': 'lorem', 'b': 'ipsum', 'c': 'dolor', 'd': 'sit', 'e': 'amet'}, {'a': 'lorem', 'b': 'ipsum', 'c': 'dolor', 'd': 'sit'}]}

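Additional details: Yes, I want to remove the foobar key with the None value from the first dictionary in each list. Yes, the second dictionary in each list is foobar.

Code I tried: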

data = {}
toplevel = {}
secondlevel = {}
details = []

for alphabet in root.findall("*[@type='3']"):
    toplevel.clear()
    secondlevel.clear()

    for i in alphabet:
        toplevel.update({i.tag: i.text})
        toplevel = {k: v for k, v in toplevel.items() if v}  # Remove keys if the value equals None or "          \n" (doesn't work)

    for foobar in alphabet.find("foobar"):
        secondlevel.update({foobar.tag: foobar.text})
        continue


    details.append([toplevel, secondlevel])

    data.update({alphabet.find("b").text: details})

XML:

<lorem>
    <foo type="3">
        <a>foo</a>
        <b>bar</b>
        <c>bazz</c>
        <d>boo</d>
        <e>bing</e>

        <foobar>
            <a>foo</a>
            <b>bar</b>
            <c>bazz</c>
            <d>boo</d>
        </foobar>
    </foo>

    <foo type="3">
        <a>lorem</a>
        <b>ipsum</b>
        <c>dolor</c>
        <d>sit</d>
        <e>amet</e>

        <foobar>
            <a>lorem</a>
            <b>ipsum</b>
            <c>dolor</c>
            <d>sit</d>
        </foobar>
    </foo>
</lorem>

Answer 1

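I think the PicklingTools XML-to-dict converter can give you exactly what you want.

See: XML to dict translation doc

You may have to play with the flags to get what you want (I think you will want to drop the attributes, so that you do not end up with "type" as a key). Perhaps use XML_LOAD_DROP_ALL_ATTRS.

You want something like this, which takes an XML file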

<top>
  <a>1</a>
  <b>2.2</b>
  <c>three</c>
</top>

and converts it to a dictionary:

>>> from xmlloader import *
>>> example = file('example.xml', 'r')
>>> xl = StreamXMLLoader(example, 0)  # 0 = All defaults on options
>>> result = xl.expectXML()
>>> print result
{'top': {'a': '1', 'c': 'three', 'b': '2.2'}}

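The code comes in two versions: a pure-Python version and a C extension module. The second is for when you need the raw speed of C to convert large XML files.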
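If pulling in an extra library is not an option, the wanted result can probably also be built with the standard library alone. The following is only a rough sketch, not part of PicklingTools: it assumes the XML from the question is well-formed (closing tags fixed), and the helper names `leaf_dict` and `build` are made up for illustration. It creates fresh dictionaries on every iteration instead of clearing and reusing shared ones, which is most likely what mixed the entries up in the attempted code.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the XML from the question (assumption: the
# unclosed tags in the post were typos).
XML = """<lorem>
    <foo type="3">
        <a>foo</a><b>bar</b><c>bazz</c><d>boo</d><e>bing</e>
        <foobar>
            <a>foo</a><b>bar</b><c>bazz</c><d>boo</d>
        </foobar>
    </foo>
    <foo type="3">
        <a>lorem</a><b>ipsum</b><c>dolor</c><d>sit</d><e>amet</e>
        <foobar>
            <a>lorem</a><b>ipsum</b><c>dolor</c><d>sit</d>
        </foobar>
    </foo>
</lorem>"""


def leaf_dict(element):
    """Map tag -> stripped text for children that have no children of
    their own and whose text is not empty or whitespace only."""
    return {
        child.tag: child.text.strip()
        for child in element
        if len(child) == 0 and child.text and child.text.strip()
    }


def build(root):
    """Return {<b> text: [top-level dict, foobar dict]} for each type='3' child."""
    data = {}
    for foo in root.findall("*[@type='3']"):
        top = leaf_dict(foo)  # new dict each time, nothing is shared
        nested = foo.find("foobar")
        second = leaf_dict(nested) if nested is not None else {}
        data[top["b"]] = [top, second]
    return data


root = ET.fromstring(XML)
result = build(root)
print(result)
```

Because `leaf_dict` skips any child that itself has children, the `foobar` element never shows up as a key in the first dictionary, so there is no None or whitespace value to filter out afterwards.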
