How do I convert XML that may contain lists into a dict in Python?

Time: 2015-09-18 07:49:45

Tags: python arrays xml dictionary lxml

I have found several solutions for converting XML to a dict, but none of them handle the possibility of a list inside the XML. For example, my XML:

<Body>
    <Count>3</Count>
    <Books>
        <Book>
            <Title>Book 1</Title>
            <Author>Author 1</Author>
        </Book>
        <Book>
            <Title>Book 2</Title>
            <Author>Author 2</Author>
        </Book>
        <Book>
            <Title>Book 3</Title>
            <Author>Author 3</Author>
        </Book>
    </Books>
    <Details>
        <Errors>0</Errors>
    </Details>
</Body>

Code (a slightly modified version of https://gist.github.com/jacobian/795571):

def elem2dict(node):
    """
    Convert an lxml.etree node tree into a dict.
    """
    d = {}
    for e in node.iterchildren():
        key = e.tag.split('}')[1] if '}' in e.tag else e.tag
        if e.text is None:
            continue
        value = e.text if e.text.strip() else elem2dict(e)
        d[key] = value
    return d

Result:

{
    'Count': '3',
    'Books': {
        'Book': {
            'Title': 'Book 3',
            'Author': 'Author 3'
        }
    },
    'Details': {
        'Errors': '0'
    }
}

Desired result:

{
    'Count': '3',
    'Books':
    [
        { 
            'Title': 'Book 1',
            'Author': 'Author 1'
        },
        { 
            'Title': 'Book 2',
            'Author': 'Author 2'
        },
        { 
            'Title': 'Book 3',
            'Author': 'Author 3'
        }
    ],
    'Details': {
        'Errors': '0'
    }
}

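Notes:

  • The list is not always tagged Books/Book; it can be any pair of tags with this structure.
  • I need to exclude XML attributes, which the current code already does.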

5 answers:

Answer 0: (score: 0)

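Try this. The output is not exactly what you asked for, but it handles several kinds of arrays under the same parent. The idea is to check whether the key already exists and, if it does, turn its value into an array. In elem2dict, replace the `d[key] = value` line with: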

    if key not in d:
        d[key] = value
    elif type(d[key]) is list:
        # already an array, append
        d[key].append(value)
    else:
        # second item with same key: 
        # change the item at `key` to an array retaining
        # the existing item.
        d[key] = [d[key], value]

This gives:

{'Books': {'Book': [{'Author': 'Author 1', 'Title': 'Book 1'},
                {'Author': 'Author 2', 'Title': 'Book 2'},
                {'Author': 'Author 3', 'Title': 'Book 3'}]},
 'Count': '3',
 'Details': {'Errors': '0'}}
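
To get from this output to the shape the question asks for, one option is to collapse any wrapper element whose children all share a single repeated tag. The following is only a sketch under assumptions: it uses the standard-library xml.etree.ElementTree instead of lxml so that it runs without extra packages, the helper name local_name and the `repeated` set are illustrative, and the collapse rule is my own reading of the desired result rather than anything stated in the thread.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<Body>
    <Count>3</Count>
    <Books>
        <Book><Title>Book 1</Title><Author>Author 1</Author></Book>
        <Book><Title>Book 2</Title><Author>Author 2</Author></Book>
        <Book><Title>Book 3</Title><Author>Author 3</Author></Book>
    </Books>
    <Details>
        <Errors>0</Errors>
    </Details>
</Body>
"""


def local_name(tag):
    """Strip a '{namespace}' prefix from a tag, if present."""
    return tag.split('}', 1)[1] if '}' in tag else tag


def elem2dict(node):
    """Convert an element's children to a dict; attributes are ignored."""
    d = {}
    repeated = set()  # keys that occurred more than once under this node
    for e in node:
        key = local_name(e.tag)
        # Elements with children recurse; leaves keep their stripped text
        # (an empty leaf becomes '' instead of being skipped).
        value = elem2dict(e) if len(e) else (e.text or '').strip()
        if key not in d:
            d[key] = value
        elif key in repeated:
            d[key].append(value)
        else:
            # Second item with the same key: switch to a list.
            d[key] = [d[key], value]
            repeated.add(key)
    # Assumed collapse rule: a wrapper whose only key is a repeated tag
    # (e.g. Books containing several Book) is returned as the bare list.
    if len(d) == 1 and repeated:
        return next(iter(d.values()))
    return d


result = elem2dict(ET.fromstring(SAMPLE))
print(result)
```

One limitation of this rule: a wrapper that happens to contain a single child (one Book inside Books) still comes back as a dict, because nothing in the XML itself marks it as a list. Handling that case needs outside knowledge, such as a set of tag names that should always be treated as lists.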

Answer 1: (score: 0)

Have you looked at the ElementTree XML API?

Code:

import xml.etree.ElementTree as ET
tree = ET.parse('book.xml')
root = tree.getroot()

# Print tag and text for three levels of nesting, indented by depth.
for child in root:
    print(str(child.tag) + " " + str(child.text))
    for child2 in child:
        print("  " + str(child2.tag) + " " + str(child2.text))
        for child3 in child2:
            print("      " + str(child3.tag) + " " + str(child3.text))
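
The three nested loops above only reach three levels deep. A recursive walk handles any depth; the sketch below is one way to write it, assuming the same ElementTree API. The function name iter_lines and the inline sample string are illustrative and not part of the original answer.

```python
import xml.etree.ElementTree as ET


def iter_lines(node, depth=0):
    """Yield one indented 'tag text' line per descendant of node."""
    for child in node:
        text = (child.text or '').strip()
        # rstrip() drops the trailing space when an element has no text.
        yield ('  ' * depth + child.tag + ' ' + text).rstrip()
        yield from iter_lines(child, depth + 1)


root = ET.fromstring(
    '<Body><Count>3</Count>'
    '<Books><Book><Title>Book 1</Title></Book></Books></Body>'
)
lines = list(iter_lines(root))
print('\n'.join(lines))
```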

Answer 2: (score: 0)

I have a simple solution based on the lxml library that also produces the result as an OrderedDict. See GitHub.

Code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from lxml import etree
from collections import OrderedDict

###########################
# xml to dict
###########################

def xml2dict(fpath):
    """Parse the XML file at fpath and return it as a nested dict."""
    with open(fpath, 'rb') as f:
        # tree type: lxml.etree._ElementTree
        tree = etree.parse(f)
        # tree.getroot() type: lxml.etree._Element
        return tree2dict(tree.getroot())


def tree2dict(node):
    """"""
    subdict = OrderedDict({})
    # iterate over the children of this element--tree.getroot
    for e in node.iterchildren():
        d = tree2dict(e)
        for k in d.keys():
            # handle duplicated tags
            if k in subdict:
                v = subdict[k]
                # append succeeds only if v is already a list;
                # otherwise AttributeError is raised and handled below
                try:
                    v.append(d[k])
                    subdict.update({k:v})
                except AttributeError:
                    subdict.update({k:[v,d[k]]})
            else:
                subdict.update(d)
    if subdict:
        return {node.tag: subdict}
    else:
        return {node.tag: node.text}


if __name__ == '__main__':
    print(xml2dict('test.xml'))

Answer 3: (score: 0)

If you build a list of dicts instead, it may work the way you need. Try the following:

def elem2dict(node):
    """
    Convert an lxml.etree node tree into a dict.
    """
    lis = []
    for e in node.iterchildren():
        d = {}
        key = e.tag.split('}')[1] if '}' in e.tag else e.tag
        if e.text is None:
            continue
        value = e.text if e.text.strip() else elem2dict(e)
        d[key] = value
        lis.append(d)
    return lis

It gives:

[{'Count': '3'},
 {'Books': [{'Book': [{'Title': 'Book 1'}, {'Author': 'Author 1'}]},
            {'Book': [{'Title': 'Book 2'}, {'Author': 'Author 2'}]},
            {'Book': [{'Title': 'Book 3'}, {'Author': 'Author 3'}]}]},
 {'Details': [{'Errors': '0'}]}]

Answer 4: (score: 0)

Use the xmltodict library. The following snippet does the job easily:

import xmltodict
# `file` is the path to the XML document
with open(file) as fd:
    xml = fd.read()
    xml_dict = xmltodict.parse(xml)

Repeated elements are already parsed into a list by this library, and can be accessed as follows:

list_books = xml_dict['Body']['Books']['Book']
book_0 = list_books[0]
book_1 = list_books[1]

Hope this helps!