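I have been trying to parse XML with Python lately, and today I ran into another problem.
Do you know how to identify elements that sit at the same level in XML?
Take the following sample XML: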
<AAA>
<BBB>1</BBB>
<CCC>*</CCC>
<BBB>1</BBB> <--- need to remove
<BBB>1</BBB>
<CCC>*</CCC>
<BBB>1</BBB> <--- need to remove
</AAA>
I know how to remove an element on the first or last line, but how do I remove the BBB element directly below each CCC?
Answer 0 (score: 1)
Here is a solution using ElementTree.
from xml.etree import ElementTree as ET
XML = """
<AAA>
<BBB>1</BBB>
<CCC>*</CCC>
<BBB>2</BBB>
<BBB>3</BBB>
<CCC>*</CCC>
<BBB>4</BBB>
</AAA>"""
root = ET.fromstring(XML)
# All children of AAA (siblings in document order)
children = root.findall("*")
# Find all BBB elements that immediately follow a CCC element
to_remove = []
for i in range(1, len(children)):
    curr = children[i]
    prev = children[i-1]
    if curr.tag == "BBB" and prev.tag == "CCC":
        to_remove.append(curr)
# Remove the found BBB elements
for elem in to_remove:
    root.remove(elem)
print(ET.tostring(root).decode("UTF-8"))
Output:
<AAA>
<BBB>1</BBB>
<CCC>*</CCC>
<BBB>3</BBB>
<CCC>*</CCC>
</AAA>
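If the same pattern comes up with other tag names, the loop above can be wrapped in a small helper. The following is only a sketch: the function name `remove_after` and its parameters are made up for illustration and are not part of ElementTree. It pairs each child with its previous sibling using `zip` and removes the matches afterwards, so the list being scanned is never modified during the scan.

```python
from xml.etree import ElementTree as ET


def remove_after(parent, prev_tag, target_tag):
    """Remove each direct child tagged target_tag that immediately
    follows a sibling tagged prev_tag. Returns the number removed.
    (Hypothetical helper, not an ElementTree API.)"""
    children = list(parent)  # snapshot of the siblings in document order
    doomed = [
        curr
        for prev, curr in zip(children, children[1:])
        if prev.tag == prev_tag and curr.tag == target_tag
    ]
    # Remove only after the scan is finished
    for elem in doomed:
        parent.remove(elem)
    return len(doomed)


root = ET.fromstring(
    "<AAA><BBB>1</BBB><CCC>*</CCC><BBB>2</BBB>"
    "<BBB>3</BBB><CCC>*</CCC><BBB>4</BBB></AAA>"
)
removed = remove_after(root, "CCC", "BBB")
print(removed, [(child.tag, child.text) for child in root])
```

Because the comparison runs against the snapshot, a run such as CCC, BBB, BBB loses only the first BBB. Note also that `Element.remove` only works on direct children of the element it is called on, so for nested documents the helper would have to be called once per parent.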