I'm trying to remove a Data element if its descendant value element has the text "1".
I did some research and found how to remove the value element itself, but I have no idea how to remove its grandparent.
I know I can find the element by its text like this and then remove it, but that's all I could find.
# xpath() returns a list, so remove each match from its own parent
for e in root.xpath('.//value[text()="1"]'):
    e.getparent().remove(e)
My XML document looks like this:
<root>
<Data>
<FirstName>Name</FirstName>
<EMail>email@email.com</EMail>
<Number>123</Number>
<delete>
<value>0</value>
</delete>
</Data>
<Data>
<FirstName>Name</FirstName>
<EMail>some@email.com</EMail>
<delete>
<value>1</value>
</delete>
<Number>456</Number>
</Data>
</root>
Expected result:
<root>
<Data>
<FirstName>Name</FirstName>
<EMail>email@email.com</EMail>
<Number>123</Number>
<delete>
<value>0</value>
</delete>
</Data>
</root>
Basically, I want to remove a Data element if its value element contains specific text.
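Answer 0 (score: 1)
Consider XSLT (a sibling of XPath), a special-purpose language designed to transform XML files into other XML. Besides XPath 1.0, Python's lxml module can also run XSLT 1.0 scripts.
XSLT (save as an .xsl file, which is itself a special kind of .xml file)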
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<!-- Identity Transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Removes entire Data node with delete child value equal to 1 -->
<xsl:template match="Data[delete/value='1']"/>
</xsl:transform>
Python (no for loops or if logic)
import lxml.etree as et
# LOAD XML AND XSL
xml = et.parse('input.xml')
xsl = et.parse('xslt_script.xsl')
# TRANSFORM INPUT
transform = et.XSLT(xsl)
result = transform(xml)
# SAVE TO FILE
with open('output.xml', 'wb') as f:
    f.write(result)
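To try the transform without creating any files, here is a minimal sketch that inlines both the stylesheet and the sample document from the question as byte strings. The names XSL and XML are only illustrative, and the stylesheet is the same identity transform plus one empty template shown above.

```python
import lxml.etree as et

# Stylesheet from the answer above, inlined for a self-contained run.
XSL = b"""<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="Data[delete/value='1']"/>
</xsl:transform>"""

# Sample input from the question.
XML = b"""<root>
  <Data>
    <FirstName>Name</FirstName>
    <EMail>email@email.com</EMail>
    <Number>123</Number>
    <delete><value>0</value></delete>
  </Data>
  <Data>
    <FirstName>Name</FirstName>
    <EMail>some@email.com</EMail>
    <delete><value>1</value></delete>
    <Number>456</Number>
  </Data>
</root>"""

transform = et.XSLT(et.XML(XSL))   # compile the stylesheet
result = transform(et.XML(XML))    # apply it to the parsed input

print(et.tostring(result, pretty_print=True).decode())
```

The empty template is what does the removal: any Data element matching the predicate produces no output, while everything else is copied through by the identity template.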
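Answer 1 (score: 1)
If you want to remove an element, you need its parent. In this case, the parent of Data is root (which also happens to be the root element).
Instead of selecting value, use a predicate to select Data and remove it from root, like this...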
Python
from lxml import etree
tree = etree.parse("test.xml")
root = tree.getroot()
for data in tree.xpath("./Data[delete/value='1']"):
    root.remove(data)
print(etree.tostring(tree, pretty_print=True).decode())
Printed output
<root>
<Data>
<FirstName>Name</FirstName>
<EMail>email@email.com</EMail>
<Number>123</Number>
<delete>
<value>0</value>
</delete>
</Data>
</root>
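If I need to remove an element, I hardly ever use getparent(); I select the parent explicitly. If I needed a more complex transformation, I would use XSLT as Parfait suggested.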
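For comparison, the getparent() route the question was reaching for might look like the sketch below. It assumes the structure from the question (value inside delete inside Data) and inlines a trimmed copy of the sample document; the None checks are a defensive addition, not something either answer prescribes.

```python
from lxml import etree

# Trimmed copy of the sample document from the question.
XML = b"""<root>
  <Data>
    <Number>123</Number>
    <delete><value>0</value></delete>
  </Data>
  <Data>
    <delete><value>1</value></delete>
    <Number>456</Number>
  </Data>
</root>"""

root = etree.fromstring(XML)

# xpath() returns a plain list, so removing nodes while looping is safe.
for value in root.xpath('.//value[text()="1"]'):
    delete = value.getparent()          # value -> delete
    if delete is None:
        continue
    data = delete.getparent()           # delete -> Data (the grandparent)
    if data is None:
        continue
    parent = data.getparent()           # Data -> root
    if parent is not None:              # skip if Data was already detached
        parent.remove(data)

print(etree.tostring(root, pretty_print=True).decode())
```

This walks up from the match, which ties the code to the exact nesting depth. Selecting Data directly with a predicate, as in the answer above, avoids that coupling.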