Here is my XML:
<beans>
  <property name = "type1">
    <list>
      <bean class = "bean1">
        <property name = "typeb">
          <value>foo</value>
        </property>
      </bean>
      <bean class = "bean2">
        <property name ="typeb">
          <value>bar</value>
        </property>
      </bean>
    </list>
  </property>
  <property name = "type2">
    <list>
      <bean class = "bean3">
        <list>
          <property name= "typec">
            <sometags/>
          </property>
          <property name= "typed">
            <list>
              <value>foo</value>
              <value>bar</bar>
            </list>
          </property>
        </list>
      </bean>
    </list>
  </property>
</beans>
What I want to do is scan for and delete these elements:
<bean class = "bean1">
  <property name = "typeb">
    <value>foo</value>
  </property>
</bean>
and
<value>foo</value>
(from the property name="typed" element).
To do that, I would like to write something like this:
for element in root.iter('value'):
    if element.text == 'foo':
        p1 = element.getParent()
        if p1.tag == 'list':  # second case scenario, remove just the value tag.
            p1.remove(element)
        else:  # first case scenario - remove entire bean
            p2 = p1.getParent()
            p3 = p2.getParent()
            p3.remove(p2)
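But ElementTree does not let a child element see its parent.
What is an efficient way to achieve this? Since this is a deep XML structure, I am not keen on the idea of a recursive function that checks the tag type at every level.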
Answer 0 (score: 1)
With ElementTree, start from the parent and search for the relevant child:
>>> parent = root.find('.//bean[@class="bean1"]')
>>> parent
<Element 'bean' at 0x10eb31550>
>>> parent.find('.//value').text
'foo'
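The lookup above only finds the elements; it does not delete anything. A minimal sketch of how the same parent-first idea could be carried through to removal, using only the standard library, might look like the following. The XML string is a trimmed-down, hypothetical stand-in for the data in the question, and remove_foo is an illustrative name, not part of ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the XML in the question.
XML = """
<beans>
  <property name="type1">
    <list>
      <bean class="bean1">
        <property name="typeb"><value>foo</value></property>
      </bean>
      <bean class="bean2">
        <property name="typeb"><value>bar</value></property>
      </bean>
    </list>
  </property>
  <property name="typed">
    <list>
      <value>foo</value>
      <value>bar</value>
    </list>
  </property>
</beans>
"""


def remove_foo(root, text="foo"):
    """Walk parents first, so every removal already has its parent in hand."""
    # Snapshot the elements so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "value" and child.text == text and parent.tag == "list":
                # Second case: a bare <value> directly inside a <list>.
                parent.remove(child)
            elif child.tag == "bean" and any(
                v.text == text for v in child.findall("./property/value")
            ):
                # First case: a <bean> whose <property> holds the value.
                parent.remove(child)


root = ET.fromstring(XML)
remove_foo(root)
print(ET.tostring(root, encoding="unicode"))
```

Because the loop always holds the parent, no child-to-parent lookup is needed. The trade-off is that the two cases from the question are hard-coded into the walk.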
Answer 1 (score: 1)
Here is how I solved it:
# Yields a (parent, child) tuple for every element in the tree.
def iterparent(tree):
    for parent in tree.iter():  # getiterator() was removed in Python 3.9
        for child in parent:
            yield parent, child

# Recursive function. Deletes the ancestor n levels above the given child.
# If n == 0 it deletes just the child.
def removeParent(root, childToRemove, n):
    for parent, child in list(iterparent(root)):
        if childToRemove is child:
            if n > 0:
                removeParent(root, parent, n - 1)
            else:
                parent.remove(child)
            return

valuesToDelete = ['foo']  # assumed: the text values to look for
# Snapshot the pairs first so removals do not disturb the iteration.
for parent, child in list(iterparent(root)):
    if child.tag == 'value' and child.text in valuesToDelete:
        if parent.tag == 'list':
            removeParent(root, child, 0)
        else:
            removeParent(root, child, 2)
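It is actually fairly elegant. I like it.
This approach works well for my purposes, but it may run into trouble with other element structures and nesting depths.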
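One cost of the approach above is that removeParent rescans the whole tree for every level it climbs. A common alternative is to build a child-to-parent dictionary once and then look parents up directly. The following is a minimal sketch under that assumption; the XML string is a hypothetical, trimmed-down stand-in for the question's data, and build_parent_map and remove_matches are illustrative names.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the XML in the question.
XML = """
<beans>
  <property name="type1">
    <list>
      <bean class="bean1">
        <property name="typeb"><value>foo</value></property>
      </bean>
      <bean class="bean2">
        <property name="typeb"><value>bar</value></property>
      </bean>
    </list>
  </property>
  <property name="typed">
    <list>
      <value>foo</value>
      <value>bar</value>
    </list>
  </property>
</beans>
"""


def build_parent_map(root):
    """Map every child element to its parent in a single pass."""
    return {child: parent for parent in root.iter() for child in parent}


def remove_matches(root, values_to_delete):
    parent_of = build_parent_map(root)
    # Snapshot the matches so removals do not disturb the iteration.
    for value in list(root.iter("value")):
        if value.text not in values_to_delete:
            continue
        parent = parent_of[value]
        if parent.tag == "list":
            # Second case: remove just the <value>.
            parent.remove(value)
        else:
            # First case: value -> property -> bean, then drop the bean.
            bean = parent_of[parent]
            container = parent_of[bean]
            if bean in list(container):  # skip beans already removed
                container.remove(bean)


root = ET.fromstring(XML)
remove_matches(root, {"foo"})
print(ET.tostring(root, encoding="unicode"))
```

The map is built before any removal, so it stays valid for lookups even after elements are detached. The sketch assumes a matching value is never the root element and, in the first case, always sits exactly two levels below a bean.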
Answer 2 (score: 0)
The lxml.etree module has a getparent method. Given your example XML (well, after fixing the mismatched closing tag), I can do this:
>>> from lxml import etree
>>>
>>> with open('data.xml', 'rb') as fd:
...     doc = etree.parse(fd)
...
>>> matches = doc.xpath('//value[text()="foo"]')
>>> element = matches[0]
>>> etree.tostring(element, encoding='unicode')
'<value>foo</value>\n '
>>> print(etree.tostring(element, encoding='unicode'))
<value>foo</value>
>>> parent = element.getparent()
>>> print(etree.tostring(parent, encoding='unicode'))
<property name="typeb">
<value>foo</value>
</property>
>>> parent = parent.getparent()
>>> print(etree.tostring(parent, encoding='unicode'))
<bean class="bean1">
<property name="typeb">
<value>foo</value>
</property>
</bean>
...and so on.