Removing an entire node with lxml

Time: 2016-10-07 16:38:55

Tags: python xml lxml

I have an XML document that looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
    <groupId>company</groupId>
    <artifactId>art-id</artifactId>
    <version>RELEASE</version>
</parent>

<properties>
    <tomcat.username>admin</tomcat.username>
    <tomcat.password>admin</tomcat.password>
</properties>

<dependencies>
    <dependency>
        <groupId>asdf</groupId>
        <artifactId>asdf</artifactId>
        <version>[3.8,)</version>
    </dependency>
    <dependency>
        <groupId>asdf</groupId>
        <artifactId>asdf</artifactId>
        <version>[4.1,)</version>
    </dependency>
</dependencies>
</project>

How do I remove the entire "dependencies" node?

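I have looked at other questions and answers on Stack Overflow. What differs here is the namespace on this XML, and that the other questions ask about removing child elements such as "dependency", whereas I want to remove the entire "dependencies" node. Is there a simple way to remove the whole node with lxml?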

The following raises a 'NoneType' object has no attribute 'remove' error:

from lxml import etree as ET

tree = ET.parse('pom.xml')
namespace = '{http://maven.apache.org/POM/4.0.0}'
root = ET.Element(namespace+'project')
root.find(namespace+'dependencies').remove()

2 answers:

Answer 0 (score: 1)

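First, grab the root node. Because the document opens with <project ...> (as opposed to a self-closing <project .../>), the "parent" element of dependencies is project. An example from the documentation: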

import xml.etree.ElementTree as ET  
tree = ET.parse('country_data.xml')  
root = tree.getroot() 

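Once you have root, check root.tag (an attribute, not a method). For this document it should be "{http://maven.apache.org/POM/4.0.0}project", that is, "project" qualified with the default namespace.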

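Then call root.remove(root.find('{http://maven.apache.org/POM/4.0.0}dependencies')), where root is the project node. The tag has to be namespace-qualified because the document declares a default namespace; a bare 'dependencies' matches nothing, so find returns None.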

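If it were <project .../> instead, the document would not be well-formed XML, because there must be exactly one root element. Still, I can see where you are coming from.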
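The steps in this answer can be sketched with the standard-library ElementTree module that the snippet above imports. This is a minimal sketch, not the asker's actual file: the POM string is a trimmed-down stand-in for pom.xml, and the names POM and NS are chosen here for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for pom.xml (an assumption for illustration).
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
<modelVersion>4.0.0</modelVersion>
<dependencies>
    <dependency><artifactId>asdf</artifactId></dependency>
</dependencies>
</project>"""

# Namespace URI in Clark notation, as it appears in front of every tag name.
NS = "{http://maven.apache.org/POM/4.0.0}"

root = ET.fromstring(POM)
print(root.tag)  # the root tag carries the namespace prefix

# An unqualified lookup matches nothing and returns None.
unqualified = root.find("dependencies")

# The qualified lookup finds the element; remove it through its parent.
deps = root.find(NS + "dependencies")
root.remove(deps)

remaining = [child.tag for child in root]
print(remaining)
```

The same find and remove calls exist on lxml elements, so the sketch should carry over to lxml.etree unchanged.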

Answer 1 (score: 1)

You can create a dict that maps a prefix to the namespace, find the node, and then call root.remove, passing it the node. Do not call .remove on the node itself:

x = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
    <groupId>company</groupId>
    <artifactId>art-id</artifactId>
    <version>RELEASE</version>
</parent>    
<properties>
    <tomcat.username>admin</tomcat.username>
    <tomcat.password>admin</tomcat.password>
</properties>    
<dependencies>
    <dependency>
        <groupId>asdf</groupId>
        <artifactId>asdf</artifactId>
        <version>[3.8,)</version>
    </dependency>
    <dependency>
        <groupId>asdf</groupId>
        <artifactId>asdf</artifactId>
        <version>[4.1,)</version>
    </dependency>
</dependencies>
</project>"""
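import lxml.etree as et
from io import BytesIO

# Parse from bytes so the encoding declaration in the XML header is honored.
tree = et.parse(BytesIO(x.encode("utf-8")))
root = tree.getroot()

# Map a prefix of our choosing to the document's default namespace.
nsmap = {"mav": "http://maven.apache.org/POM/4.0.0"}

# Remove the child through its parent element.
root.remove(root.find("mav:dependencies", namespaces=nsmap))

# encoding="unicode" returns str rather than bytes.
print(et.tostring(tree, encoding="unicode"))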

Which gives you:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
    <groupId>company</groupId>
    <artifactId>art-id</artifactId>
    <version>RELEASE</version>
</parent>    
<properties>
    <tomcat.username>admin</tomcat.username>
    <tomcat.password>admin</tomcat.password>
</properties>   
</project>
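Since find returns None when nothing matches, a small guard can avoid the kind of NoneType error from the question. The following is only a sketch under stated assumptions: remove_children is a hypothetical helper written for this example, not part of lxml; the inline document is a shortened stand-in for the POM; and the import falls back to the standard library so the sketch still runs where lxml is not installed.

```python
try:
    import lxml.etree as et  # the library used in the answer above
except ImportError:
    # Fallback (an assumption for portability): the calls used below
    # exist with the same signatures in the standard library.
    import xml.etree.ElementTree as et

NSMAP = {"mav": "http://maven.apache.org/POM/4.0.0"}


def remove_children(parent, path, namespaces=None):
    """Hypothetical helper: remove each direct child of `parent` that
    matches `path` (assumed to be a single child step) and return how
    many elements were removed."""
    matches = parent.findall(path, namespaces=namespaces)
    for node in matches:
        parent.remove(node)
    return len(matches)


# Shortened stand-in for the POM in the question.
doc = b"""<project xmlns="http://maven.apache.org/POM/4.0.0">
<properties/>
<dependencies><dependency/></dependencies>
</project>"""

root = et.fromstring(doc)
removed = remove_children(root, "mav:dependencies", NSMAP)
# A second call finds nothing and simply reports zero removals.
removed_again = remove_children(root, "mav:dependencies", NSMAP)
print(removed, removed_again)
```

Returning a count rather than raising keeps the call safe to repeat, which may be convenient when the same script processes POM files that do not all contain a dependencies section.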