Error parsing XML with Python because of namespaces

Time: 2020-05-25 07:57:52

Tags: python python-3.x xml xml-parsing xml-namespaces

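I am using the script below to remove child nodes from the XML below based on their image type, but it raises an error because of the xmlns header. I removed the header, but the script still removes only 3 of the 5 matching child nodes.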

Could you please take a look?

<?xml version="1.0" encoding="UTF-8"?>
<!-- Copyright (c) All rights reserved. -->
<dummy_list xmlns="https://dummy_list_file"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="template.xsd">
   <dummy_capability>
       <dummy_type>1</dummy_type>
       <dummy_type_string>dummy_3700E</dummy_type_string>
       <dummy_image>c3700</dummy_image>
       <dummy_string>dummy3702E,dummy3701E</dummy_string>
       <dummy_capabilities>
           <CSTREAMS>True</CSTREAMS>
           <ABC_SUPPORTED>True</ABC_SUPPORTED>
           <THRESHOLD_SUPPORTED>True</THRESHOLD_SUPPORTED>
           <FABRIC_CABLE>True</FABRIC_CABLE>
       </dummy_capabilities>
   </dummy_capability>
   <dummy_capability>
       <dummy_type>2</dummy_type>
       <dummy_type_string>dummy_2700E</dummy_type_string>
       <dummy_image>c2700</dummy_image>
       <dummy_string>dummy2702E,dummy2701E</dummy_string>
       <dummy_capabilities>
           <CSTREAMS>True</CSTREAMS>
           <ABC_SUPPORTED>True</ABC_SUPPORTED>
           <THRESHOLD_SUPPORTED>True</THRESHOLD_SUPPORTED>
           <FABRIC_CABLE>True</FABRIC_CABLE>
       </dummy_capabilities>
   </dummy_capability>
   <dummy_capability>
       <dummy_type>3</dummy_type>
       <dummy_type_string>dummy_1700E</dummy_type_string>
       <dummy_image>c1700</dummy_image>
       <dummy_string>dummy1702E,dummy1701E</dummy_string>
       <dummy_capabilities>
           <CSTREAMS>True</CSTREAMS>
           <ABC_SUPPORTED>True</ABC_SUPPORTED>
           <THRESHOLD_SUPPORTED>True</THRESHOLD_SUPPORTED>
           <FABRIC_CABLE>True</FABRIC_CABLE>
       </dummy_capabilities>
   </dummy_capability>
   <dummy_capability>
       <dummy_type>4</dummy_type>
       <dummy_type_string>dummy_4700E</dummy_type_string>
       <dummy_image>c4700</dummy_image>
       <dummy_string>dummy4702E,dummy4701E</dummy_string>
       <dummy_capabilities>
           <CSTREAMS>True</CSTREAMS>
           <ABC_SUPPORTED>True</ABC_SUPPORTED>
           <THRESHOLD_SUPPORTED>True</THRESHOLD_SUPPORTED>
           <FABRIC_CABLE>True</FABRIC_CABLE>
       </dummy_capabilities>
   </dummy_capability>
   <dummy_capability>
       <dummy_type>4</dummy_type>
       <dummy_type_string>dummy_4700E</dummy_type_string>
       <dummy_image>c4700</dummy_image>
       <dummy_string>dummy4702E,dummy4701E</dummy_string>
       <dummy_capabilities>
           <CSTREAMS>True</CSTREAMS>
           <ABC_SUPPORTED>True</ABC_SUPPORTED>
           <THRESHOLD_SUPPORTED>True</THRESHOLD_SUPPORTED>
           <FABRIC_CABLE>True</FABRIC_CABLE>
       </dummy_capabilities>
   </dummy_capability>
   <dummy_capability>
       <dummy_type>4</dummy_type>
       <dummy_type_string>dummy_4700E</dummy_type_string>
       <dummy_image>c4700</dummy_image>
       <dummy_string>dummy4702E,dummy4701E</dummy_string>
       <dummy_capabilities>
           <CSTREAMS>True</CSTREAMS>
           <ABC_SUPPORTED>True</ABC_SUPPORTED>
           <THRESHOLD_SUPPORTED>True</THRESHOLD_SUPPORTED>
           <FABRIC_CABLE>True</FABRIC_CABLE>
       </dummy_capabilities>
   </dummy_capability>
</dummy_list>
#!/router/bin/python3-3.6.3
from xml.etree.ElementTree import ElementTree
tree = ElementTree()
tree.parse('dummy.xml')

root = tree.getroot()

for child in root:
    if (child.find('dummy_image').text == 'c3700'):
        print("Removing child: " + child.find('dummy_image').text)
        root.remove(child)

tree.write('out.xml')
  1. How can I parse the file when these declarations are also present?
xmlns="https://dummy_list_file"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="template.xsd"
  2. Why are not all child nodes of the given image type removed?

1 answer:

Answer 0: (score: 0)

Another approach.

from simplified_scrapy import SimplifiedDoc, utils
xml = utils.getFileContent('dummy.xml')
doc = SimplifiedDoc(xml)
dummy_capabilitys = doc.selects('dummy_image').contains('c3700').parent
for dummy_capability in dummy_capabilitys:
  dummy_capability.repleaceSelf("")
utils.saveFile("out.xml",doc.html)
# Get attributes
root = doc.select('dummy_list')
print (root["xmlns"],root["xmlns:xsi"],root["xsi:schemaLocation"])

Result:

https://dummy_list_file http://www.w3.org/2001/XMLSchema-instance template.xsd

More examples: https://github.com/yiyedata/simplified-scrapy-demo/tree/master/doc_examples
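
For comparison, here is a minimal standard-library sketch that addresses both questions with xml.etree.ElementTree. It is not taken from the original post. It assumes the namespace URI from the question, uses an arbitrary local prefix "d", and works on a hypothetical inline sample with five consecutive c3700 entries instead of dummy.xml. With the default namespace in place, an unqualified find('dummy_image') returns None, so the lookups have to be namespace-qualified. Separately, removing children while iterating over the parent shifts the remaining siblings, so every other match is skipped; iterating over a copy of the child list avoids that.

```python
import xml.etree.ElementTree as ET

NS_URI = "https://dummy_list_file"  # default namespace declared on <dummy_list>
NS = {"d": NS_URI}                  # "d" is an arbitrary prefix used only in this sketch

# Hypothetical input: five consecutive entries with the same image type.
SAMPLE = (
    '<dummy_list xmlns="%s">' % NS_URI
    + "<dummy_capability><dummy_image>c3700</dummy_image></dummy_capability>" * 5
    + "</dummy_list>"
)


def remove_while_iterating(root, image):
    """Mirror the loop from the question, with namespace-qualified lookups."""
    removed = 0
    for child in root:
        node = child.find("d:dummy_image", NS)
        if node is not None and node.text == image:
            # Removing here shifts later siblings left, so the next one is skipped.
            root.remove(child)
            removed += 1
    return removed


def remove_from_snapshot(root, image):
    """Iterate over a copy of the child list so removals cannot skip siblings."""
    removed = 0
    for child in list(root):
        node = child.find("d:dummy_image", NS)
        if node is not None and node.text == image:
            root.remove(child)
            removed += 1
    return removed


# Serialize the default namespace as xmlns="..." rather than an ns0: prefix.
ET.register_namespace("", NS_URI)

root_a = ET.fromstring(SAMPLE)
skipped = remove_while_iterating(root_a, "c3700")

root_b = ET.fromstring(SAMPLE)
complete = remove_from_snapshot(root_b, "c3700")

print(skipped, len(root_a))   # removed count, children left (in-place loop)
print(complete, len(root_b))  # removed count, children left (snapshot loop)
```

To apply the same idea to a file, the tree could be loaded with ET.parse('dummy.xml'), passed through remove_from_snapshot(tree.getroot(), 'c3700'), and saved with tree.write('out.xml', encoding='UTF-8', xml_declaration=True).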