Question

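I'm new to Python and am just trying to parse an XML file. I've been looking at the XML file, my very simple Python code, and several tutorials on parsing XML with Python, but I can't figure out my problem. I started removing attributes from the root element of the XML file and found something strange (see below):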

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type='text/xsl' href='cci2html.xsl'?>
<list xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
            xsi:schemaLocation="http://blah.edu/blah.xsd" 
            xmlns="http://blah.edu/case">
    <metadata>
      <version>2016-06-27</version>
      <publishdate>2016-06-27</publishdate>
    </metadata>
    <items>
      <item id="XXX-001545">
        <status>draft</status>
        <publishdate>2010-05-11</publishdate>
        <contributor>John Doe</contributor>
        <definition>Definition 1</definition>
        <type>policy</type>
        <references>
            <reference creator="Tester 1" title="Doc1" version="3" 
                location="http://www.blah.com" index="1" />
            <reference creator="Tester 2" title="Doc2" version="1" 
                location="http://www.blah.com" index="2" />
            <reference creator="Tester 3" title="Doc3" version="4" 
                location="http://www.blah.com" index="3" />
        </references>
      </item>
      <item id="XXX-001546">
        <status>draft</status>
        <publishdate>2010-05-11</publishdate>
        <contributor>Jane Doe</contributor>
        <definition>Definition 2</definition>
        <type>policy</type>
        <references>
            <reference creator="Tester 1" title="Doc1" version="3" 
                location="http://www.blah.com" index="1" />
            <reference creator="Tester 2" title="Doc2" version="1" 
                location="http://www.blah.com" index="2" />
            <reference creator="Tester 3" title="Doc3" version="4" 
                location="http://www.blah.com" index="3" />
        </references>
      </item>
  </items>
</list>

My Python code:

import xml.etree.ElementTree as ET

tree = ET.parse("test.xml")
root = tree.getroot()
print(root.tag, root.attrib)
# Expected to run once for the <items> element
for cci in root.iter('items'):
  print('hello')
  print(cci.tag, cci.attrib)

Output:

{http://blah.edu/case}list {'{http://www.w3.org/2001/XMLSchema-instance}schemaLocation': 'http://blah.edu/blah.xsd'}

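When I leave the last part of the root element, xmlns="http://blah.edu/case", in the XML file, I get the output shown above, but execution never enters the root.iter('items') loop. If I remove that last part, it enters the for loop and iterates over all the elements.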

Correct output:

    list {'{http://www.w3.org/2001/XMLSchema-instance}schemaLocation': 
          'http://blah.edu/blah.xsd'}
    hello
    item {'id': 'XXX-001545'}
    hello
    item {'id': 'XXX-001546'}

Why? I do need that last part, because it is not my XML file.
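For anyone reproducing this, here is a minimal sketch of what seems to be going on, not a definitive answer. A default namespace declared on the root applies to every unprefixed child element, and ElementTree stores each tag as "{uri}name", so a bare 'items' would not match. The inline XML is a trimmed-down stand-in for test.xml, and the names XML, NS and the prefix "c" are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for test.xml that keeps the default namespace.
XML = """<list xmlns="http://blah.edu/case">
  <items>
    <item id="XXX-001545"/>
    <item id="XXX-001546"/>
  </items>
</list>"""

NS = "http://blah.edu/case"

root = ET.fromstring(XML)

# The bare name matches nothing: every tag is stored as "{uri}name".
bare = list(root.iter("items"))

# Qualifying the name with the namespace URI matches the element.
qualified = list(root.iter(f"{{{NS}}}items"))

# findall accepts a prefix map; "c" is an arbitrary prefix chosen here.
item_ids = [item.get("id") for item in root.findall("c:items/c:item", {"c": NS})]

print(len(bare), len(qualified), item_ids)
```

With this approach the XML file itself stays untouched, which matters here since the xmlns declaration cannot be removed.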

Element tree iteration with Python 3

0 answers: