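I want to parse an XML file with the Python code below. An excerpt of the XML file follows the code. I need to "extract" everything that comes after "inv: name =", e.g. "'datasource roof height' and (value = 1000 or value = 2000 or value = 3000 or value = 4000 or value = 5000 or value = 6000)". Any ideas?
My Python code (so far):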
from lxml import etree
doc = etree.parse("data.xml")
for con in doc.xpath("//specification"):
    for cons in con.xpath("./@body"):
        with open("output.txt", "w") as cons_out:
            cons_out.write(cons)
            cons_out.close()
Part of the XML file:
<ownedRule xmi:type="uml:Constraint" xmi:id="EAID_OR000004_EE68_4efa_8E1B_8DDFA8F95FB8" name="datasource roof height">
<constrainedElement xmi:idref="EAID_94F3B0A6_EE68_4efa_8E1B_8DDFA8F95FB8"/>
<specification xmi:type="uml:OpaqueExpression" xmi:id="EAID_COE000004_EE68_4efa_8E1B_8DDFA8F95FB8" body="inv: name = 'datasource roof height' and (value = 1000 or value = 2000 or value = 3000 or value = 4000 or value = 5000 or value = 6000)"/>
</ownedRule>
Answer 0 (score: 0)
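XML parsers understand attributes and elements. What those attributes or elements contain (their text content) is of no concern to the XML parser.
To solve your problem, you need to split the string retrieved from the body attribute. Of course, I am assuming that the body attribute of every element has content in the same format, i.e. "inv: name = some content".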
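from lxml import etree

doc = etree.parse("data.xml")

# Open the output file once; reopening it with "w" inside the loop
# would truncate it and keep only the last match.
with open("output.txt", "w") as cons_out:
    for con in doc.xpath("//specification"):
        for cons in con.xpath("./@body"):
            # Keep everything after the first "inv: name =" prefix.
            content = cons.split("inv: name =", 1)[1]
            cons_out.write(content.strip() + "\n")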
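If some body attributes might not start with "inv: name =", indexing the result of split() with [1] raises an IndexError. The following is only a minimal sketch of a more tolerant variant, not part of the original answer. It uses the standard-library xml.etree.ElementTree instead of lxml so that it runs without extra packages, and the inline SAMPLE document, the PREFIX constant, and the extract_constraints helper are hypothetical names introduced for illustration. The xmi:* attributes of the real file are left out of the sample so no namespace declaration is needed.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample; the real file also carries xmi:* attributes,
# which are omitted here to avoid guessing the namespace declaration.
SAMPLE = """<root>
  <ownedRule name="datasource roof height">
    <specification body="inv: name = 'datasource roof height' and (value = 1000 or value = 2000)"/>
  </ownedRule>
  <ownedRule name="no prefix">
    <specification body="something else"/>
  </ownedRule>
</root>"""

PREFIX = "inv: name ="  # assumed to be identical in every constraint


def extract_constraints(xml_text, prefix=PREFIX):
    """Return the text after `prefix` for each specification body that contains it."""
    root = ET.fromstring(xml_text)
    results = []
    for spec in root.iter("specification"):
        body = spec.get("body", "")
        # partition() never raises; `sep` is empty when the prefix is absent.
        _, sep, rest = body.partition(prefix)
        if sep:
            results.append(rest.strip())
    return results


constraints = extract_constraints(SAMPLE)
print(constraints)
```

Bodies without the prefix are skipped silently here; whether to skip, log, or fail in that case depends on how uniform the real data is.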