Remove unwanted tags from an XML file

Time: 2019-05-16 05:37:39

Tags: python python-3.x scala xml-parsing

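I am working with XML files that contain SOAP tags (the soap:Envelope and soap:Body wrappers). I want to remove those SOAP tags as part of the XML cleanup process while keeping the content they wrap.

How can this be done in Python or Scala? Shell scripts should not be used.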

Sample input:

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://sample.com/">
   <soap:Body>
      <com:RESPONSE xmlns:com="http://sample.com/">
         <Student>
            <StudentID>100234</StudentID>
            <Gender>Male</Gender>
            <Surname>Robert</Surname>
            <Firstname>Mathews</Firstname>
         </Student>
      </com:RESPONSE>
   </soap:Body>
</soap:Envelope>

Expected output:

<?xml version="1.0" encoding="UTF-8"?>
      <com:RESPONSE xmlns:com="http://sample.com/">
         <Student>
            <StudentID>100234</StudentID>
            <Gender>Male</Gender>
            <Surname>Robert</Surname>
            <Firstname>Mathews</Firstname>
         </Student>
      </com:RESPONSE>

1 answer:

Answer 0 (score: 0)

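This may help. The wrappers are matched by namespace rather than by the literal name "soap", and the payload is extracted instead of removing the wrapper elements, because removing an element also removes everything inside it.

from copy import deepcopy
from lxml import etree

# 'soap' is a namespace prefix, not an element name, so '//soap' matches nothing.
# Map the prefix to the URI declared in the sample input instead.
ns = {'soap': 'http://sample.com/'}

doc = etree.parse('test.xml')
# Removing the envelope would also delete its children, so copy out the payload
# (the first child of soap:Body) and serialize only that.
payload = deepcopy(doc.xpath('/soap:Envelope/soap:Body/*', namespaces=ns)[0])
etree.cleanup_namespaces(payload)  # drop namespace declarations the payload does not use
print(etree.tostring(payload, xml_declaration=True, encoding='UTF-8').decode())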
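
If installing lxml is not an option, the same idea can be sketched with the standard library alone. This is only a minimal sketch under some assumptions: the function name strip_soap_wrapper and the constants SOAP_NS and SAMPLE are made up for illustration, the namespace URI is the one from the sample input (a real SOAP 1.1 envelope normally uses http://schemas.xmlsoap.org/soap/envelope/), and the body is assumed to hold a single payload element.

```python
import xml.etree.ElementTree as ET

# Assumption: URI bound to the soap prefix in the sample input
SOAP_NS = "http://sample.com/"

# Sample input from the question, inlined so the sketch is self-contained
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://sample.com/">
   <soap:Body>
      <com:RESPONSE xmlns:com="http://sample.com/">
         <Student>
            <StudentID>100234</StudentID>
            <Gender>Male</Gender>
            <Surname>Robert</Surname>
            <Firstname>Mathews</Firstname>
         </Student>
      </com:RESPONSE>
   </soap:Body>
</soap:Envelope>
"""


def strip_soap_wrapper(xml_text, payload_prefix="com"):
    """Return the first child of soap:Body as a standalone XML string."""
    # ElementTree forgets the original prefixes; registering one keeps
    # 'com:' on output instead of a generated 'ns0:'. This registry is global.
    ET.register_namespace(payload_prefix, SOAP_NS)
    root = ET.fromstring(xml_text.encode("utf-8"))
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("no soap:Body payload found")
    payload = body[0]
    payload.tail = None  # drop the whitespace that followed the payload
    # The declaration is prepended by hand; write the result out as UTF-8
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(payload, encoding="unicode")


if __name__ == "__main__":
    print(strip_soap_wrapper(SAMPLE))
```

To process a file instead, read its text and pass it to strip_soap_wrapper, then write the returned string back with encoding="utf-8". The indentation inside the payload is kept as it was in the input.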