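Please bear in mind that I am new to Python. I am trying to copy some XML nodes from sample1.xml into out.xml if they do not already exist in sample2.xml.
This is how far I got before getting stuck: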
import xml.etree.ElementTree as ET

tree = ET.ElementTree(file='sample1.xml')
addtree = ET.ElementTree(file='sample2.xml')
root = tree.getroot()
addroot = addtree.getroot()
for adel in addroot.findall('.//cars/car'):
    for el in root.findall('cars/car'):
        with open('out.xml', 'w+') as f:
            f.write("BEFORE\n")
            f.write(el.tag)
            f.write("\n")
            f.write(adel.tag)
            f.write("\n")
            f.write("\n")
            f.write("AFTER\n")
            el = adel
            f.write(el.tag)
            f.write("\n")
            f.write(adel.tag)
I don't know what I am missing, but it only copies the actual "tag" itself.
Output:
BEFORE
car
car
AFTER
car
car
So I am missing the child nodes, as well as the <, >, </ and > markup. The expected result is shown below.
sample1.xml:
<cars>
    <car>
        <use-car>0</use-car>
        <use-gas>0</use-gas>
        <car-name />
        <car-key />
        <car-location>hawaii</car-location>
        <car-port>5</car-port>
    </car>
</cars>
sample2.xml:
<cars>
    <old>
        1
    </old>
    <new>
        8
    </new>
    <car />
</cars>
Expected result in out.xml (the final product):
<cars>
    <old>
        1
    </old>
    <new>
        8
    </old>
    <car>
        <use-car>0</use-car>
        <use-gas>0</use-gas>
        <car-name />
        <car-key />
        <car-location>hawaii</car-location>
        <car-port>5</car-port>
    </car>
</cars>
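All other nodes, old and new, must remain unchanged. I am only trying to replace <car /> with the car node from sample1.xml, including all of its child and grandchild nodes (if any exist).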
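For context on the output shown earlier: Element.tag holds only the element's name as a string, which is why nothing but car was written. Serialising an element together with its markup and descendants is what ET.tostring does. A minimal sketch, assuming an inline XML string as a stand-in for sample1.xml (the names SAMPLE1, tag_only and serialized are illustrative only):

```python
import xml.etree.ElementTree as ET

# Inline stand-in for sample1.xml (an assumption, to keep the sketch self-contained)
SAMPLE1 = "<cars><car><use-car>0</use-car><car-port>5</car-port></car></cars>"

root = ET.fromstring(SAMPLE1)
car = root.find("car")

# .tag is only the element's name, with no brackets and no children
tag_only = car.tag

# tostring() serialises the element, its markup and all of its descendants
serialized = ET.tostring(car, encoding="unicode")

print(tag_only)
print(serialized)
```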
Answer 0 (score: 3)
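First, your XML has a couple of minor problems: the closing cars tag is missing its /, and the closing new tag incorrectly reads old when it should read new.
Second, a disclaimer: my solution has its limitations. In particular, it will not repeatedly substitute the car node from sample1 into multiple spots in sample2. It does work for the sample files you provided, though.
Third, thanks to the top few answers on "access ElementTree node parent node"; they informed the implementation of get_node_parent_info below.
Finally, the code: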
import xml.etree.ElementTree as ET


def find_child(node, with_name):
    """Recursively find node with given name"""
    for element in list(node):
        if element.tag == with_name:
            return element
        elif list(element):
            sub_result = find_child(element, with_name)
            if sub_result is not None:
                return sub_result
    return None


def replace_node(from_tree, to_tree, node_name):
    """
    Replace node with given node_name in to_tree with
    the same-named node from the from_tree
    """
    # Find nodes of given name ('car' in the example) in each tree
    from_node = find_child(from_tree.getroot(), node_name)
    to_node = find_child(to_tree.getroot(), node_name)

    # Find where to substitute the from_node into the to_tree
    to_parent, to_index = get_node_parent_info(to_tree, to_node)

    # Replace to_node with from_node
    to_parent.remove(to_node)
    to_parent.insert(to_index, from_node)


def get_node_parent_info(tree, node):
    """
    Return tuple of (parent, index) where:
        parent = node's parent within tree
        index = index of node under parent
    """
    parent_map = {c: p for p in tree.iter() for c in p}
    parent = parent_map[node]
    return parent, list(parent).index(node)


from_tree = ET.ElementTree(file='sample1.xml')
to_tree = ET.ElementTree(file='sample2.xml')

replace_node(from_tree, to_tree, 'car')

# ET.dump(to_tree)
to_tree.write('output.xml')
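
The limitation mentioned in the disclaimer (only one spot gets replaced) could be lifted along the following lines. This is only a sketch, not part of the solution above: replace_all is a hypothetical helper, the inline XML strings stand in for the sample files, and copy.deepcopy is used so that each spot receives its own independent copy of the source node.

```python
import copy
import xml.etree.ElementTree as ET


def replace_all(from_root, to_root, node_name):
    """Replace every node_name element under to_root with a copy of the
    first node_name element found in from_root. Returns the count replaced."""
    source = next(from_root.iter(node_name), None)
    if source is None:
        return 0

    # Collect (parent, index) pairs first so the tree is not changed while iterating
    targets = [
        (parent, index)
        for parent in to_root.iter()
        for index, child in enumerate(parent)
        if child.tag == node_name
    ]

    for parent, index in targets:
        replacement = copy.deepcopy(source)
        # Keep the placeholder's trailing whitespace so the layout is preserved
        replacement.tail = parent[index].tail
        parent[index] = replacement
    return len(targets)


# Inline stand-ins for sample1.xml and a sample2.xml with two placeholders (assumptions)
from_root = ET.fromstring("<cars><car><car-port>5</car-port></car></cars>")
to_root = ET.fromstring("<cars><old>1</old><car /><lot><car /></lot></cars>")

count = replace_all(from_root, to_root, "car")
result = ET.tostring(to_root, encoding="unicode")
print(count, result)
```

Because the replacement is assigned in place by index, no positions shift while the loop runs, and the source tree is left untouched.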
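Update: I recently noticed that the find_child() implementation in my original solution would fail if the "child" in question was not in the first branch of the XML tree being traversed. I have updated the implementation above to correct this.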
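
The case described in the update can be checked with a small in-memory tree. The sketch below is an alternative, not the implementation above: find_first is a hypothetical name, and it relies on Element.iter(), which walks the tree depth-first in document order, so it should find the same node as the recursive search.

```python
import xml.etree.ElementTree as ET


def find_first(node, with_name):
    """Return the first descendant of node with the given tag, or None."""
    for element in node.iter(with_name):
        # iter() also yields node itself when its tag matches; skip it so that
        # only descendants are considered, as in find_child()
        if element is not node:
            return element
    return None


# The target sits in the second branch, which is the case the update describes
root = ET.fromstring(
    "<cars><old><year>1</year></old><new><car>x</car></new></cars>"
)

found = find_first(root, "car")
print(found.tag, found.text)
```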