lxml: removing an element does not work

Time: 2013-04-12 15:15:08

Tags: python xml lxml

I am trying to remove XML elements using lxml. The approach looks right to me, but nothing gets removed. This is my code:

import lxml.etree as le
f = open('Bird.rdf','r')
doc=le.parse(f)
for elem in doc.xpath("//*[local-name() = 'dc' and namespace-uri() = 'http://purl.org/dc/terms/']"):
    parent=elem.getparent().remove(elem)
print(le.tostring(doc))

Sample XML file:

<rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/terms/"> 

        <wo:Class rdf:about="/nature/life/Bird#class">
                    <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a
                        covering of feathers, and their front limbs are modified into wings. Some birds, such as
                        penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds
                        are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or
                        they will perish</dc:description>
        </wo:Class>
</rdf:RDF>                  

1 answer:

Answer 0 (score: 4)

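Your problem is that the local-name is 'description', not 'dc' (which is only the namespace alias, i.e. the prefix). You can pass the namespaces to the xpath function and write the XPath expression directly, like this: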

import lxml.etree as le

txt="""<rdf:RDF xmlns:rdf="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/terms/"
    xmlns:wo="http:/some/wo/namespace">

    <wo:Class rdf:about="/nature/life/Bird#class">
       <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a
                        covering of feathers, and their front limbs are modified into wings. Some birds, such as
                        penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds
                        are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or
                        they will perish</dc:description>
    </wo:Class>
</rdf:RDF>
"""

namespaces = { 
    "rdf":"http://www.w3.org/2000/01/rdf-schema#",
    "dc":"http://purl.org/dc/terms/",
    "wo":"http:/some/wo/namespace" }

doc=le.fromstring(txt)
for elem in doc.xpath("//dc:description", namespaces=namespaces):
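    # remove() returns None, so there is nothing useful to assign
    elem.getparent().remove(elem)
# encoding="unicode" returns str rather than bytes, which prints cleanly
print(le.tostring(doc, encoding="unicode"))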
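
For comparison, the predicate style from the question also works once the local name is corrected to 'description'. The following is only a minimal sketch: the document is a trimmed-down stand-in for the sample file, and the `root` element and the `http://example.org/wo#` namespace URI are placeholders invented for illustration, not taken from the real Bird.rdf.

```python
import lxml.etree as le

DC_NS = "http://purl.org/dc/terms/"

# Trimmed-down stand-in for the sample file; the root element and the
# "wo" namespace URI are hypothetical placeholders.
txt = """<root xmlns:dc="http://purl.org/dc/terms/" xmlns:wo="http://example.org/wo#">
    <wo:Class>
        <dc:description>Birds are a class of vertebrates.</dc:description>
    </wo:Class>
</root>"""

doc = le.fromstring(txt)

# Same predicate style as the question, but matching the element's real
# local name ('description') instead of its prefix ('dc').
query = "//*[local-name() = 'description' and namespace-uri() = '%s']" % DC_NS
matches = doc.xpath(query)
for elem in matches:
    elem.getparent().remove(elem)

removed = len(matches)

# Clark notation ({namespace}tag) checks the result without a prefix map.
remaining = doc.findall(".//{%s}description" % DC_NS)
print(removed, len(remaining))
```

One caveat to keep in mind: in lxml, remove() also drops the tail text attached to the removed element. Here the tail is only whitespace, so nothing of value is lost.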