Python for loop over XML

Time: 2017-08-14 19:44:23

Tags: python xml

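I need some help with iteration. The root of my XML is sdnEntry. If I index with [0] and do not iterate over the document, I can retrieve the text values, but as soon as I loop I get an error such as:

last_names = sdns.getElementsByTagName("lastName")
AttributeError: 'NodeList' object has no attribute 'getElementsByTagName'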

My working code, without any iteration, looks like this:

from xml.dom import minidom
xmldoc = minidom.parse("/Users/cohen/Documents/project/sdn.xml")
sdns = xmldoc.getElementsByTagName("sdnEntry")[0]
last_names = sdns.getElementsByTagName("lastName")[0]
ln = last_names.firstChild.data
types = sdns.getElementsByTagName("sdnType")[0]
t = types.firstChild.data


programs = sdns.getElementsByTagName("programList")[0] #program.firstChild.data
s = programs.getElementsByTagName("program")[0].firstChild.data
akas = sdns.getElementsByTagName("akaList")[0] #child lastName.fourthChild.data
a = akas.getElementsByTagName("aka")[0]
a1 = a.getElementsByTagName("lastName")[0].firstChild.data

addresses = sdns.getElementsByTagName("addressList")[0]
ad1 = addresses.getElementsByTagName("address")[0]
ad2 = ad1.getElementsByTagName("city")[0]
city= ad2.firstChild.data
ad3 = ad1.getElementsByTagName("country")[0]
country = ad3.firstChild.data

This is my XML:

<sdnEntry>
    <uid>36</uid>
    <lastName>AEROCARIBBEAN AIRLINES</lastName>
    <sdnType>Entity</sdnType>
    <programList>
      <program>CUBA</program>
    </programList>
    <akaList>
      <aka>
        <uid>12</uid>
        <type>a.k.a.</type>
        <category>strong</category>
        <lastName>AERO-CARIBBEAN</lastName>
      </aka>
    </akaList>
    <addressList>
      <address>
        <uid>25</uid>
        <city>Havana</city>
        <country>Cuba</country>
      </address>
    </addressList>
  </sdnEntry>

Below is my for loop. Please advise. Thanks in advance!

for sdn in sdns:
    for ln in last_names:
        print(ln)
        for t in types:
            print(t)
            for program in programs:
                print (s)
                for aka in akas:
                    print(a1)
                    for address in addresses:
                        print(city)
                        print(country)

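I need to store each sdnEntry in my database, so for each entry I only need to know:

  • <name> (lastName AEROCARIBBEAN AIRLINES)
  • <sdnType> (Entity)
  • the <program> values from the program list, e.g. (program CUBA), but there can be more than one
  • <aka><lastName> (AERO-CARIBBEAN), all of them
  • <address>, all of them (city Havana, country Cuba)

How can I do this?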

2 Answers:

Answer 0 (score: 1)

from xml.etree import ElementTree

# I included this list to help
all_nodes = ['sdnEntry', 'uid', 'lastName', 'sdnType', 'programList', 'program', 'akaList',
             'aka', 'uid', 'type', 'category', 'lastName', 'addressList', 'address', 'uid',
             'city', 'country']

required_nodes = ['lastName', 'uid', 'program', 'type', 'category', 'city', 'country']

# needed because some tag names are repeated (uid, lastName)
keys = ['sdnEntry_uid', 'lastName', 'program', 'aka_uid', 'type', 'category', 'aka_lastName',
        'address_uid', 'city', 'country']

sdn_data = {}
index = 0

with open('stuff.xml', 'r') as xml_file:
    tree = ElementTree.parse(xml_file)

# iterate all nodes
for node in tree.iter():
    # check if a required node
    if node.tag in required_nodes:
        # add to dictionary
        sdn_data[keys[index]] = node.text
        index += 1

# Use this to test
for key, value in sdn_data.items():
    print(key, value)

Output:
sdnEntry_uid 36
lastName AEROCARIBBEAN AIRLINES
program CUBA
aka_uid 12
type a.k.a.
category strong
aka_lastName AERO-CARIBBEAN
address_uid 25
city Havana
country Cuba
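
The loop above relies on tag order and a fixed list of keys, so it fits a file with a single entry. Since the question mentions many sdnEntry elements, each with possibly several programs, akas and addresses, a per-entry variant might look like the sketch below. It is only a sketch under assumptions: the `<sdnList>` wrapper element, the `SAMPLE_XML` string, the `extract_entries` helper and the dictionary keys are made up for illustration, and the XML is assumed to have no namespace.

```python
from xml.etree import ElementTree

# Sample document. The <sdnList> wrapper is an assumption; the question
# only shows a single <sdnEntry>.
SAMPLE_XML = """\
<sdnList>
  <sdnEntry>
    <uid>36</uid>
    <lastName>AEROCARIBBEAN AIRLINES</lastName>
    <sdnType>Entity</sdnType>
    <programList>
      <program>CUBA</program>
    </programList>
    <akaList>
      <aka>
        <uid>12</uid>
        <type>a.k.a.</type>
        <category>strong</category>
        <lastName>AERO-CARIBBEAN</lastName>
      </aka>
    </akaList>
    <addressList>
      <address>
        <uid>25</uid>
        <city>Havana</city>
        <country>Cuba</country>
      </address>
    </addressList>
  </sdnEntry>
</sdnList>
"""


def extract_entries(xml_text):
    """Return one dict per <sdnEntry>, with lists for repeatable children."""
    root = ElementTree.fromstring(xml_text)
    entries = []
    # iter() also yields the root itself, so a bare <sdnEntry> document works too
    for entry in root.iter("sdnEntry"):
        entries.append({
            "uid": entry.findtext("uid"),
            "lastName": entry.findtext("lastName"),
            "sdnType": entry.findtext("sdnType"),
            # findall() returns every match, so zero, one or many all work
            "programs": [p.text for p in entry.findall("programList/program")],
            "akas": [a.findtext("lastName") for a in entry.findall("akaList/aka")],
            "addresses": [
                {"city": ad.findtext("city"), "country": ad.findtext("country")}
                for ad in entry.findall("addressList/address")
            ],
        })
    return entries


for row in extract_entries(SAMPLE_XML):
    print(row)
```

Each dict then corresponds to one database row (or one parent row plus child rows for the lists). If the real file declares an XML namespace, the tag names passed to `iter()`, `findtext()` and `findall()` would need the namespace prefix as well.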

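Answer 1 (score: 0)

Not really an answer, but I suggest you try xmltodict. The API is easier to work with IMO, and if you do run into errors they will definitely be less cryptic (i.e., because the full result payload is just a Python dict, it is easy to look at and see where things might have gone wrong).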
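
For what it's worth, a rough sketch of that idea could look like the following. It assumes xmltodict is installed (`pip install xmltodict`), that the entries sit inside a hypothetical `<sdnList>` wrapper, and that the shortened `SAMPLE_XML` string and the `rows` layout are purely illustrative. The `force_list` argument asks xmltodict to always return lists for the named tags, so the same loop works whether there is one child or many.

```python
import xmltodict  # third-party package, not in the standard library

# Shortened sample; the <sdnList> wrapper is an assumption.
SAMPLE_XML = """\
<sdnList>
  <sdnEntry>
    <uid>36</uid>
    <lastName>AEROCARIBBEAN AIRLINES</lastName>
    <sdnType>Entity</sdnType>
    <programList><program>CUBA</program></programList>
    <akaList>
      <aka><uid>12</uid><lastName>AERO-CARIBBEAN</lastName></aka>
    </akaList>
    <addressList>
      <address><uid>25</uid><city>Havana</city><country>Cuba</country></address>
    </addressList>
  </sdnEntry>
</sdnList>
"""

# Without force_list, a tag that occurs once becomes a dict/str and a tag
# that occurs several times becomes a list, which makes looping awkward.
doc = xmltodict.parse(
    SAMPLE_XML, force_list=("sdnEntry", "program", "aka", "address")
)

rows = []
for entry in doc["sdnList"]["sdnEntry"]:
    # "or {}" covers entries where the list element is missing or empty
    programs = (entry.get("programList") or {}).get("program", [])
    akas = (entry.get("akaList") or {}).get("aka", [])
    addresses = (entry.get("addressList") or {}).get("address", [])
    rows.append({
        "lastName": entry.get("lastName"),
        "sdnType": entry.get("sdnType"),
        "programs": list(programs),
        "akas": [aka.get("lastName") for aka in akas],
        "addresses": [(ad.get("city"), ad.get("country")) for ad in addresses],
    })

print(rows)
```

Printing `doc` itself is a quick way to see the whole structure when something does not come out as expected.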