XML attributes are empty

Time: 2018-11-02 20:19:00

Tags: python xml

I am reading XML from a file into Python 3.6 on Windows 10. Here is a sample of the XML:

<?xml version="1.0"?>
<rss version="2.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel>
        <item>     
            <BurnLocation>@ 32 40 52.99 @ 80 57 33.00</BurnLocation>
            <geo:lat>32.681389</geo:lat>
            <geo:long>-80.959167</geo:long>
            <County>Jasper</County>
            <BurnType>PD</BurnType> 
            <BurnTypeDescription>PILED DEBRIS</BurnTypeDescription> 
            <Acres>2</Acres> 
        </item>
        <item>     
            <BurnLocation>@ 33 29 34.26 @ 81 15 52.89</BurnLocation>
            <geo:lat>33.492851</geo:lat>
            <geo:long>-81.264694</geo:long>
            <County>Orangebrg</County>
            <BurnType>PD</BurnType> 
            <BurnTypeDescription>PILED DEBRIS</BurnTypeDescription> 
            <Acres>1</Acres> 
        </item>
    </channel>
</rss>

Here is a version of my code:

import os
import xml.etree.ElementTree as ET

local_filename = os.path.join('C:\\Temp\\test\\', filename)
tree = ET.parse(local_filename)
root = tree.getroot()

for child in root:
    for next1 in child:
        for next2 in next1:
            print(next2.tag,next2.attrib)

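The problem is that I can't seem to isolate the attributes of the child tags; they all come back as empty dictionaries. Here is a sample of the output: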

   BurnLocation {}
   {http://www.w3.org/2003/01/geo/wgs84_pos#}lat {}
   {http://www.w3.org/2003/01/geo/wgs84_pos#}long {}
   County {}
   BurnType {}
   BurnTypeDescription {}
   Acres {}
   BurnLocation {}
   {http://www.w3.org/2003/01/geo/wgs84_pos#}lat {}
   {http://www.w3.org/2003/01/geo/wgs84_pos#}long {}
   County {}
   BurnType {}
   BurnTypeDescription {}
   Acres {}

I am trying to print the values inside the tags (e.g. Jasper). What am I doing wrong?

1 answer:

Answer 0: (score: 1)

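What you want is each element's text content, not its attributes. Attributes are the name="value" pairs inside a start tag; none of the elements under <item> in your sample have any, so attrib is an empty dictionary for each of them.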
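To make the distinction concrete, here is a minimal sketch. The `id="53"` attribute is made up purely for illustration; it does not appear in the question's data.

```python
import xml.etree.ElementTree as ET

# Hypothetical element: "id" is an attribute, "Jasper" is the text content.
with_attr = ET.fromstring('<County id="53">Jasper</County>')
print(with_attr.attrib)  # the name="value" pairs from the start tag
print(with_attr.text)    # the character data between the tags

# The same element as it appears in the question's sample: no attributes.
plain = ET.fromstring('<County>Jasper</County>')
print(plain.attrib)      # empty dictionary
print(plain.text)
```
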

This should do it (slightly simplified to use a fixed filename):

import xml.etree.ElementTree as ET

tree = ET.parse('sample.xml')
root = tree.getroot()

for child in root:
    for next1 in child:
        for next2 in next1:
            print('{} = "{}"'.format(next2.tag, next2.text))
        print()

However, I would simplify it by

  1. finding all <item> elements in one go, and
  2. then iterating over their child elements.

Thus:

import xml.etree.ElementTree as ET

tree = ET.parse('sample.xml')

for item in tree.findall('*/item'):
    for elem in item:
        print('{} = "{}"'.format(elem.tag, elem.text))
    print()
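
If you only need specific fields rather than every child, something like the following may be handier. It is a sketch under a few assumptions: the XML is inlined as a trimmed copy of the sample (hence `ET.fromstring` instead of `ET.parse`), and the `NS` mapping and `records` list are names introduced here for illustration. The `geo:` prefix has to be mapped to its namespace URI, which is why those tags show up as `{http://www.w3.org/2003/01/geo/wgs84_pos#}lat` in the output above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from the question, inlined so the sketch runs on its own.
SAMPLE = """<rss version="2.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
    <channel>
        <item>
            <geo:lat>32.681389</geo:lat>
            <geo:long>-80.959167</geo:long>
            <County>Jasper</County>
        </item>
        <item>
            <geo:lat>33.492851</geo:lat>
            <geo:long>-81.264694</geo:long>
            <County>Orangebrg</County>
        </item>
    </channel>
</rss>"""

# Prefix-to-URI mapping used to resolve "geo:..." in the search paths.
NS = {'geo': 'http://www.w3.org/2003/01/geo/wgs84_pos#'}

root = ET.fromstring(SAMPLE)

records = []
for item in root.findall('channel/item'):
    records.append({
        'county': item.findtext('County'),
        'lat': float(item.findtext('geo:lat', namespaces=NS)),
        'long': float(item.findtext('geo:long', namespaces=NS)),
    })

for record in records:
    print(record)
```

`findtext` returns None when the child is missing, so with real data you may want to check for that before calling `float`.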