I have this XML:
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<SOAP-ENV:Body>
<m:request xmlns:m="http://www.datapower.com/schemas/management" domain="XXXXX">
<m:do-action>
<FlushDocumentCache>
<XMLManager class="XMLManager">default</XMLManager>
</FlushDocumentCache>
<FlushStylesheetCache>
<XMLManager class="XMLManager">default</XMLManager>
</FlushStylesheetCache>
</m:do-action>
</m:request>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
I want to change only the value of the domain attribute, XXXXX.
I tried something like this:
import xml.etree.ElementTree as etree
tree = etree.parse('input.xml')
# HOW TO FIND THE VALUE XXXXX AND CHANGE IT WITH A NEW VALUE ???
tree.write('output.xml')
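Thanks.
Answer 0 (score: 0)
A few remarks:
Parsing the XML from a file and then writing it to another file does not yield the same document, because ElementTree rewrites it on output. You can check this by simply running the code you posted (minus the 3rd line, obviously):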
import xml.etree.ElementTree as etree
tree = etree.parse('input.xml')
tree.write('output.xml')
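All SOAP-ENV:* nodes were converted to ns0:*, and the m:* nodes to ns1:*. To work around this, I had to copy the prefixes from the XML file into the code (the soap_env_ns_name and m_ns_name variables) and register them, as described in: Saving XML using ETree in Python. It's not retaining namespaces, and adding ns0, ns1 and removing xmlns tags.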
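To make the prefix rewriting concrete, here is a minimal sketch that serializes the same tree before and after registering the prefixes. The XML is a trimmed-down version of the one in the question, and the constant names (SOAP_ENV_URI, M_URI) are only illustrative. Note that register_namespace is global, so it affects every later serialization in the process.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_URI = "http://schemas.xmlsoap.org/soap/envelope/"
M_URI = "http://www.datapower.com/schemas/management"

# Trimmed-down version of the XML from the question.
xml_text = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="%s">'
    '<SOAP-ENV:Body>'
    '<m:request xmlns:m="%s" domain="XXXXX"/>'
    '</SOAP-ENV:Body>'
    '</SOAP-ENV:Envelope>' % (SOAP_ENV_URI, M_URI)
)

root = ET.fromstring(xml_text)

# Without registration, the serializer invents prefixes (ns0, ns1, ...).
before = ET.tostring(root, encoding="unicode")

# Map each namespace URI back to the prefix used in the input file.
ET.register_namespace("SOAP-ENV", SOAP_ENV_URI)
ET.register_namespace("m", M_URI)
after = ET.tostring(root, encoding="unicode")

print(before)
print(after)
```

The elements themselves only store the namespace URI, which is why registering the prefixes after parsing is still enough to change the output.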
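The SOAP-ENC namespace and the standard ones (xsi and xsd) were dropped, since nothing in the XML references them. Also, the m declaration moved from the request node to the Envelope (root) node; I'm not sure whether that is part of the standard, but in most XML I've seen, namespaces are declared on the root node. Either way, there's nothing to be done about it here; Python's parser isn't very smart.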
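The two effects described above (unused declarations disappearing, and xmlns:m moving to the root) can be observed with a short sketch like the one below. It uses a shortened version of the question's XML with only one unused namespace (xsi); the variable names out and root_tag are just for illustration.

```python
import xml.etree.ElementTree as ET

# Keep the original prefixes on output (global registration).
ET.register_namespace("SOAP-ENV", "http://schemas.xmlsoap.org/soap/envelope/")
ET.register_namespace("m", "http://www.datapower.com/schemas/management")

xml_text = (
    '<SOAP-ENV:Envelope'
    ' xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"'
    ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<SOAP-ENV:Body>'
    '<m:request xmlns:m="http://www.datapower.com/schemas/management"'
    ' domain="XXXXX"/>'
    '</SOAP-ENV:Body>'
    '</SOAP-ENV:Envelope>'
)

out = ET.tostring(ET.fromstring(xml_text), encoding="unicode")

# Everything up to the first ">" is the root element's start tag.
root_tag = out[:out.index(">")]

print(out)
print("xsi declaration kept:", "xmlns:xsi" in out)
print("m declared on root:", "xmlns:m=" in root_tag)
```

If a consumer of the output really needs the unused declarations, this sketch suggests ElementTree alone will not preserve them, so a different approach would be required.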
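So, here it is. The code is tightly coupled to the XML structure (ugly, though not the ugliest), so if the structure changes, the code needs updating too (I'm not talking about the namespace workaround here):
@EDIT1: Added the for loop that registers the namespaces; the previous version produced the ns0/ns1 prefixes described in the second remark above. Even then, it did replace X with Y when run.
@EDIT2: Commented out the domain attribute value test, so the value is now changed regardless of what it currently is.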
import xml.etree.ElementTree as ET

env_node_name = "Envelope"
body_node_name = "Body"
request_node_name = "request"
domain_attr_name = "domain"
domain_attr_val = "XXXXX"
domain_attr_new_val = "YYYYY"

# Gainarie: those are the namespaces from the xml file
soap_env_ns_name = "SOAP-ENV"
m_ns_name = "m"
#soap_enc_ns_name = "SOAP-ENC"
#xsi_ns_name = "xsi"
#xsd_ns_name = "xsd"

namespaces_dict = {
    soap_env_ns_name: "http://schemas.xmlsoap.org/soap/envelope/",
    m_ns_name: "http://www.datapower.com/schemas/management",
    # Those are simply dropped on output as they're not referenced in our xml.
    #"SOAP-ENC": "http://schemas.xmlsoap.org/soap/encoding/",
    #"xsi": "http://www.w3.org/2001/XMLSchema-instance",
    #"xsd": "http://www.w3.org/2001/XMLSchema",
}


def tag(ns, name):
    # Build a fully qualified tag name: "{namespace-uri}local-name"
    return "{" + ns + "}" + name


# Register the prefixes so the output keeps SOAP-ENV / m instead of ns0 / ns1.
for prefix, uri in namespaces_dict.items():
    ET.register_namespace(prefix, uri)

tree = ET.parse("input.xml")
root = tree.getroot()
env_gen = root.iter(tag(namespaces_dict[soap_env_ns_name], env_node_name))
# Note: a for loop already absorbs StopIteration, so the except branches
# below are only a safeguard and are not expected to run.
try:
    for env in env_gen:
        body_gen = env.iter(tag(namespaces_dict[soap_env_ns_name], body_node_name))
        try:
            for body in body_gen:
                request_gen = body.iter(tag(namespaces_dict[m_ns_name], request_node_name))
                try:
                    for request in request_gen:
                        if domain_attr_name in request.keys():
                            # Now, I didn't fully understand the question:
                            # you want to change the value of the 'domain' attribute
                            # (in your xml example: "XXXXX") to - let's say - "YYYYY"
                            # (as my code does) in one of the 2 cases below:
                            # 1: change it only if the current value is "XXXXX"
                            # 2: change it regardless of the current value
                            # For case 1, uncomment the 'if domain_attr_val ...' line
                            # below and indent the request.set(...) line under it;
                            # as written (line commented out), the code does case 2.
                            #if domain_attr_val == request.get(domain_attr_name):
                            request.set(domain_attr_name, domain_attr_new_val)
                except StopIteration:
                    print("Done iterating on '%s' node" % request_node_name)
        except StopIteration:
            print("Done iterating on '%s' node" % body_node_name)
except StopIteration:
    print("Done iterating on '%s' node" % env_node_name)

tree.write("output.xml")
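For comparison, the same attribute change can probably be written more compactly with ElementTree's find() and a prefix-to-URI mapping. This is only a sketch, not the code above: the helper name set_request_domain and its only_if parameter are made up for illustration, and it assumes m:request sits directly under SOAP-ENV:Body under the root, whereas the iter()-based code above matches at any depth.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "SOAP-ENV": "http://schemas.xmlsoap.org/soap/envelope/",
    "m": "http://www.datapower.com/schemas/management",
}


def set_request_domain(root, new_value, only_if=None):
    """Set the 'domain' attribute of Body/request (hypothetical helper).

    If only_if is given, the value is changed only when the current
    value equals only_if. Returns True if the attribute was changed.
    """
    request = root.find("SOAP-ENV:Body/m:request", NAMESPACES)
    if request is None or "domain" not in request.attrib:
        return False
    if only_if is not None and request.get("domain") != only_if:
        return False
    request.set("domain", new_value)
    return True


# Keep the original prefixes on output (global registration).
for prefix, uri in NAMESPACES.items():
    ET.register_namespace(prefix, uri)

# Trimmed-down version of the XML from the question.
xml_text = (
    '<SOAP-ENV:Envelope'
    ' xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    '<SOAP-ENV:Body>'
    '<m:request xmlns:m="http://www.datapower.com/schemas/management"'
    ' domain="XXXXX"/>'
    '</SOAP-ENV:Body>'
    '</SOAP-ENV:Envelope>'
)

root = ET.fromstring(xml_text)
changed = set_request_domain(root, "YYYYY")
result = ET.tostring(root, encoding="unicode")
print(changed)
print(result)
```

With a file on disk, the same helper could be applied to ET.parse("input.xml").getroot() before calling tree.write("output.xml"). Passing only_if="XXXXX" corresponds to case 1 from the comments in the code above, and omitting it corresponds to case 2.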