Python XML parsing: child elements and child attributes

Date: 2016-12-02 00:44:53

Tags: python python-3.x parsing xpath xml-parsing

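I have been working on this for the past two weeks and reading the Python documentation on XML parsing. I still can't figure out whether this is an XPath matter. If anyone could offer some help, I would be very grateful.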

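My XML file has many child elements. I use root.findall() to get the attributes of myAccessPoints, and three levels below it there is an element from which I want to extract several attributes. So far, however, I have only managed to do this with two for loops.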

import xml.etree.ElementTree as ET


def apData():
    tree = ET.parse("project.xml")
    root = tree.getroot()

    # First loop: print the id of every element below myAccessPoints that has one.
    for topLevels in root.findall("./myAccessPoints//*[@id]"):
        myApId = topLevels.get('id')
        print("AP:%s" % myApId)
        print()
        #return myApId

    # Second loop: print channel, MAC and SSID of each element directly under a radio.
    for radio in root.findall("./accessPoints/accessPoint/radio/*"):
        rChannel = radio.get('primaryNumber')
        rMac = radio.get('mac')
        rSsid = radio.get('ssid')
        print(rChannel, rMac, rSsid)
        #return rChannel, rMac, rSsid

Here is a sample of the XML file:

<?xml version="1.0" encoding="UTF-8"?>
<project>
  <maps>
    <map id="0" name="floorplan" pixelsPerMeter="47.808212118953044" type="fspl"/>
  </maps>
  <accessPoints>
    <accessPoint id="0" userDefinedPosition="false">
      <radio type="measured">
        <accessPointMeasurement mac="a0:63:91:21:c4:f8" ssid="Eggs" primaryNumber="7" primaryFrequencyMhz="2442" centerNumber="7" bandwidthMhz="20" security="WPA2" informationElements="000445676773010882840b162430486c0301072a01042f010430140100000fac040100000fac040100000fac020c0032040c1218602d1afc181fffff0000000000000000000000000000000000000000003d16070017000000000000000000000000000000000000004a0e14000a002c01c8001400050019007f0101dd890050f204104a0001101044000102103b00010310470010177b8b3ae292d7c44b93d4616ff30e7e1021000d4e4554474541522c20496e632e1023000a574e44523334303076331024000a574e44523334303076331042000230311054000800060050f20400011011000a574e4452333430307633100800020004103c0001031049000600372a000120dd090010180204f0040000dd180050f2020101800003a4000027a4000042435e0062322f00">
          <technologies>
            <technology band="802.11g"/>
            <technology band="802.11b"/>
            <technology band="802.11n"/>
          </technologies>
        </accessPointMeasurement>
      </radio>
    </accessPoint>
    <accessPoint id="1" userDefinedPosition="false">

Ultimately I want to extract the access point element attributes like this ->

accessPoint id

accessPointMeasurement mac,ssid,primaryNumber

technology band

technology band

technology band

Some accessPoint elements have two radios, so I have to get the accessPointMeasurement attributes twice.

I think I have to create a class, and inside that class build my own lists or dictionaries.
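A container of that kind might look roughly like the sketch below. The names `AccessPoint`, `Radio`, `ap_id`, `primary_number` and `bands` are hypothetical and chosen only for illustration; the attribute values are taken from the sample XML above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Radio:
    # Values copied from one <accessPointMeasurement> element.
    mac: str
    ssid: str
    primary_number: str
    # One entry per <technology band="..."> child.
    bands: List[str] = field(default_factory=list)


@dataclass
class AccessPoint:
    # The id attribute of <accessPoint>.
    ap_id: str
    # An access point may carry one or two radios.
    radios: List[Radio] = field(default_factory=list)


# Build one record by hand to show the shape of the data.
ap = AccessPoint(ap_id="0")
ap.radios.append(
    Radio(
        mac="a0:63:91:21:c4:f8",
        ssid="Eggs",
        primary_number="7",
        bands=["802.11g", "802.11b", "802.11n"],
    )
)
print(ap.ap_id, ap.radios[0].ssid, ap.radios[0].bands)
```

Plain dictionaries would work just as well; the dataclasses mainly give the fields names and keep each radio's list of bands separate.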

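I'm not asking anyone to do anything for me, other than to help me understand how to get each access point and its attributes in a single for loop (if that is even possible).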

Thanks for your help.

1 Answer:

Answer 0: (score: 0)

I extended the XML to include more accessPoints and radios, and used the lxml library for its XPath support. The code uses nested loops.

from lxml import etree

tree = etree.parse('temp.xml')
accessPoints = tree.xpath('.//accessPoint')

for accessPoint in accessPoints:
    print('accessPoint id:', accessPoint.attrib['id'])
    # An accessPoint may contain more than one radio.
    radios = accessPoint.xpath('radio')
    for radio in radios:
        # Assumes every radio has at least one accessPointMeasurement child.
        accessPointMeasurement = radio.xpath('accessPointMeasurement')
        print('\taccessPointMeasurement: ', accessPointMeasurement[0].attrib)
        technologies = radio.xpath('.//technology')
        for technology in technologies:
            print('\t\ttechnology: ', technology.attrib)

The result is as follows:

accessPoint id: 0
    accessPointMeasurement:  {'security': 'WPA2', 'informationElements': '000445676773010882840b162430486c0301072a01042f010430140100000fac040100000fac040100000fac020c0032040c1218602d1afc181fffff0000000000000000000000000000000000000000003d16070017000000000000000000000000000000000000004a0e14000a002c01c8001400050019007f0101dd890050f204104a0001101044000102103b00010310470010177b8b3ae292d7c44b93d4616ff30e7e1021000d4e4554474541522c20496e632e1023000a574e44523334303076331024000a574e44523334303076331042000230311054000800060050f20400011011000a574e4452333430307633100800020004103c0001031049000600372a000120dd090010180204f0040000dd180050f2020101800003a4000027a4000042435e0062322f00', 'bandwidthMhz': '20', 'centerNumber': '7', 'mac': 'a0:63:91:21:c4:f8', 'ssid': 'Eggs', 'primaryFrequencyMhz': '2442', 'primaryNumber': '7'}
        technology:  {'band': '802.11g'}
        technology:  {'band': '802.11b'}
        technology:  {'band': '802.11n'}
accessPoint id: 2
    accessPointMeasurement:  {'security': 'WPA2', 'informationElements': '000445676773010882840b162430486c0301072a01042f010430140100000fac040100000fac040100000fac020c0032040c1218602d1afc181fffff0000000000000000000000000000000000000000003d16070017000000000000000000000000000000000000004a0e14000a002c01c8001400050019007f0101dd890050f204104a0001101044000102103b00010310470010177b8b3ae292d7c44b93d4616ff30e7e1021000d4e4554474541522c20496e632e1023000a574e44523334303076331024000a574e44523334303076331042000230311054000800060050f20400011011000a574e4452333430307633100800020004103c0001031049000600372a000120dd090010180204f0040000dd180050f2020101800003a4000027a4000042435e0062322f00', 'bandwidthMhz': '20', 'centerNumber': '7', 'mac': 'a0:63:91:21:c4:f8', 'ssid': 'Eggs', 'primaryFrequencyMhz': '2442', 'primaryNumber': '7'}
        technology:  {'band': '802.11g'}
        technology:  {'band': '802.11b'}
        technology:  {'band': '802.11n'}
    accessPointMeasurement:  {'security': 'WPA2', 'informationElements': '000445676773010882840b162430486c0301072a01042f010430140100000fac040100000fac040100000fac020c0032040c1218602d1afc181fffff0000000000000000000000000000000000000000003d16070017000000000000000000000000000000000000004a0e14000a002c01c8001400050019007f0101dd890050f204104a0001101044000102103b00010310470010177b8b3ae292d7c44b93d4616ff30e7e1021000d4e4554474541522c20496e632e1023000a574e44523334303076331024000a574e44523334303076331042000230311054000800060050f20400011011000a574e4452333430307633100800020004103c0001031049000600372a000120dd090010180204f0040000dd180050f2020101800003a4000027a4000042435e0062322f00', 'bandwidthMhz': '20', 'centerNumber': '7', 'mac': 'a0:63:91:21:c4:f8', 'ssid': 'Eggs', 'primaryFrequencyMhz': '2442', 'primaryNumber': '7'}
        technology:  {'band': '802.11g'}
        technology:  {'band': '802.11b'}
        technology:  {'band': '802.11n'}
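
For readers who would rather stay with the standard library, the same nested traversal can probably be written with xml.etree.ElementTree, collecting the values into dictionaries instead of printing them. This is only a sketch: `SAMPLE` is a trimmed copy of the XML from the question (the long informationElements attribute and one band are left out), and the function name `collect_access_points` and the dictionary keys are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML, kept small for illustration.
SAMPLE = """<project>
  <accessPoints>
    <accessPoint id="0">
      <radio type="measured">
        <accessPointMeasurement mac="a0:63:91:21:c4:f8" ssid="Eggs" primaryNumber="7">
          <technologies>
            <technology band="802.11g"/>
            <technology band="802.11b"/>
          </technologies>
        </accessPointMeasurement>
      </radio>
    </accessPoint>
  </accessPoints>
</project>"""


def collect_access_points(root):
    """Return one dict per accessPoint, each with a list of radio dicts."""
    result = []
    for ap in root.findall("./accessPoints/accessPoint"):
        radios = []
        # An accessPoint with two radios yields two entries here.
        for m in ap.findall("./radio/accessPointMeasurement"):
            radios.append({
                "mac": m.get("mac"),
                "ssid": m.get("ssid"),
                "primaryNumber": m.get("primaryNumber"),
                "bands": [t.get("band") for t in m.findall(".//technology")],
            })
        result.append({"id": ap.get("id"), "radios": radios})
    return result


access_points = collect_access_points(ET.fromstring(SAMPLE))
for ap in access_points:
    print(ap["id"], ap["radios"])
```

The loops are still nested, but each level is one findall() call relative to the current element, so the hierarchy of the XML is mirrored directly in the resulting data.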