Reading an XML file with multiple namespaces in Python

Time: 2017-06-09 06:32:37

Tags: python xml

  <?xml version="1.0" encoding="UTF-8"?>
        <country:list version="3.0" xmlns:country="http://Some/www.home.com" xmlns:region="http://some/www.hello.com" xmlns:Location="http://Some/www.home.com">
            <country:Region111>
                <Some_child_tags>
                    <region:tag1 name="1">some contents in country:Region111 </region:tag1>

                    <tags>

                            .
                            .
                            .
                            .
                    </tags>
                    <region:tag1 name="2">Some other contents in country:Region111</region:tag1>
                </Some_child_tags>
            </country:Region111>

            <Location:Region222>
            <Some_child_tags>
                    <region:tag1 name="1">some contents in Location:Region222</region:tag1>
                    <tags>

                            .
                            .
                            .
                            .
                    </tags>
                    <region:tag1 name="2">Some other contents in Location:Region222</region:tag1>
                </Some_child_tags>
            </Location:Region222>
        </country:list>
  

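I want to retrieve all <region:tag1> tags, together with their attribute values, that appear under <country:Region111>...</country:Region111> but not those under <Location:Region222>...</Location:Region222>. So the final output should be the following: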

name 1 some contents in country:Region111                                
name 2 Some other contents in country:Region111
It should exclude the <region:tag1> contents that come from <Location:Region222>.

1 answer:

Answer 0 (score: 1)

Your input XML document should look like the following (valid) one:

<?xml version="1.0" encoding="UTF-8"?>
<country:list version="3.0" xmlns:country="http://Some/www.home.com" xmlns:region="http://some/www.hello.com">
<country:Region111>
     <Some_child_tags>
      <region:tag1>some contents</region:tag1>
     </Some_child_tags>
</country:Region111>
</country:list>

A solution using the xml.etree.ElementTree module:

import xml.etree.ElementTree as ET

tree = ET.parse("yourfile.xml")
root = tree.getroot()
tag1 = root.find('.//{http://some/www.hello.com}tag1')  # accessing tag with namespace

print(tag1.text)

Output:

some contents
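
The snippet above returns only the first matching tag. To get every <region:tag1> under <country:Region111> together with its name attribute, as the question asks, something along these lines may work. It is a minimal sketch, not the answerer's code: the XML_DOC string is a trimmed copy of the question's document with the elided <tags> blocks left out, and the NS mapping and the tag1_under_region111 helper are names made up for illustration. In the question's document the Location prefix is bound to the same URI as country, so the two regions are told apart here by local name (Region111 vs. Region222), not by namespace.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document (assumption: the elided <tags>
# blocks are omitted because their contents are not shown in the question).
XML_DOC = """\
<country:list version="3.0"
    xmlns:country="http://Some/www.home.com"
    xmlns:region="http://some/www.hello.com"
    xmlns:Location="http://Some/www.home.com">
  <country:Region111>
    <Some_child_tags>
      <region:tag1 name="1">some contents in country:Region111 </region:tag1>
      <region:tag1 name="2">Some other contents in country:Region111</region:tag1>
    </Some_child_tags>
  </country:Region111>
  <Location:Region222>
    <Some_child_tags>
      <region:tag1 name="1">some contents in Location:Region222</region:tag1>
      <region:tag1 name="2">Some other contents in Location:Region222</region:tag1>
    </Some_child_tags>
  </Location:Region222>
</country:list>
"""

# Prefix-to-URI map passed to find/findall; the prefixes only need to be
# consistent within the queries, the URIs must match the document.
NS = {
    "country": "http://Some/www.home.com",
    "region": "http://some/www.hello.com",
}


def tag1_under_region111(xml_text):
    """Return (name, text) pairs for each region:tag1 inside country:Region111."""
    root = ET.fromstring(xml_text)
    results = []
    # Select only the Region111 children of the root element.
    for region in root.findall("country:Region111", NS):
        # Collect every region:tag1 descendant of that element.
        for tag in region.findall(".//region:tag1", NS):
            results.append((tag.get("name"), (tag.text or "").strip()))
    return results


for name, text in tag1_under_region111(XML_DOC):
    print("name", name, text)
```

For a file on disk, ET.parse("yourfile.xml").getroot() can replace ET.fromstring(xml_text). Because the search starts from the Region111 element, tags under Location:Region222 are never visited.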