I have a special XML file that looks like this:
<alarm-dictionary source="DDD" type="ProxyComponent">
<alarm code="402" severity="Alarm" name="DDM_Alarm_402">
<message>Database memory usage low threshold crossed</message>
<description>dnKinds = database
type = quality_of_service
perceived_severity = minor
probable_cause = thresholdCrossed
additional_text = Database memory usage low threshold crossed
</description>
</alarm>
...
</alarm-dictionary>
I know that in Python I can get the alarm's "code" and "severity" attributes from the alarm tag:
for alarm_tag in dom.getElementsByTagName('alarm'):
    if alarm_tag.hasAttribute('code'):
        alarmcode = str(alarm_tag.getAttribute('code'))
And I can get the text inside the message tag like this:
for messages_tag in dom.getElementsByTagName('message'):
    messages = ""
    for message_tag in messages_tag.childNodes:
        if message_tag.nodeType in (message_tag.TEXT_NODE, message_tag.CDATA_SECTION_NODE):
            messages += message_tag.data
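But I also want to get the values inside the description tag, such as dnKinds (database), type (quality_of_service), perceived_severity (minor), probable_cause (thresholdCrossed) and additional_text (Database memory usage low threshold crossed).
In other words, I also want to parse the content inside a tag in the XML.
Can anyone help me with this? Thanks a lot!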
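Answer 0 (score: 4)
Once you have the text from the description tag, the rest has nothing to do with XML parsing. You only need simple string parsing to turn key/value strings such as type = quality_of_service into something nicer to use in Python, such as a dictionary.
With some slightly simpler parsing via ElementTree, it looks like this: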
messages = """
<alarm-dictionary source="DDD" type="ProxyComponent">
<alarm code="402" severity="Alarm" name="DDM_Alarm_402">
<message>Database memory usage low threshold crossed</message>
<description>dnKinds = database
type = quality_of_service
perceived_severity = minor
probable_cause = thresholdCrossed
additional_text = Database memory usage low threshold crossed
</description>
</alarm>
...
</alarm-dictionary>
"""
import xml.etree.ElementTree as ET
# Parse XML
tree = ET.fromstring(messages)
# getchildren() was removed in Python 3.9; iterate over the <alarm> children directly
for alarm in tree.findall("alarm"):
    # Get code and severity
    print(alarm.get("code"))
    print(alarm.get("severity"))
    # Grab description text
    descr = alarm.find("description").text
    # Parse "thing=other" into dict like {'thing': 'other'}
    info = {}
    for dl in descr.splitlines():
        if len(dl.strip()) > 0:
            key, _, value = dl.partition("=")
            info[key.strip()] = value.strip()
    print(info)
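The same steps can also be wrapped in a function that returns one record per alarm. This is only a minimal sketch for Python 3: parse_alarms, SAMPLE and the record keys ("code", "severity", "message", "info") are names made up for illustration, and the sample contains a single alarm.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<alarm-dictionary source="DDD" type="ProxyComponent">
<alarm code="402" severity="Alarm" name="DDM_Alarm_402">
<message>Database memory usage low threshold crossed</message>
<description>dnKinds = database
type = quality_of_service
perceived_severity = minor
probable_cause = thresholdCrossed
additional_text = Database memory usage low threshold crossed
</description>
</alarm>
</alarm-dictionary>"""


def parse_alarms(xml_text):
    """Return a list of dicts, one per <alarm> element (hypothetical helper)."""
    records = []
    root = ET.fromstring(xml_text)
    for alarm in root.findall("alarm"):
        info = {}
        # findtext falls back to "" when <description> is missing
        for line in alarm.findtext("description", default="").splitlines():
            key, sep, value = line.partition("=")
            if sep:  # ignore lines that contain no "="
                info[key.strip()] = value.strip()
        records.append({
            "code": alarm.get("code"),
            "severity": alarm.get("severity"),
            "message": alarm.findtext("message", default=""),
            "info": info,
        })
    return records


if __name__ == "__main__":
    for record in parse_alarms(SAMPLE):
        print(record["code"], record["severity"], record["info"])
```

Splitting with partition("=") keeps any further "=" characters inside the value, which matters if a description line ever contains one.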
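Answer 1 (score: 2)
I'm not very familiar with Python, but here is what I found after some quick research.
Since you can already get everything out of a tag in the XML, couldn't you read the description tag the same way, split its text on newlines, and then split each line on the equals sign with str.split() to separate the name from the value?
e.g.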
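for description_tag in dom.getElementsByTagName('description'):
    description = ""
    for node in description_tag.childNodes:
        if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
            description += node.data
    # Split the text into lines, then split each line on the first "="
    for line in description.splitlines():
        if '=' not in line:
            continue  # skip blank lines
        tag = line.split('=', 1)
        tagName = tag[0].strip()
        tagValue = tag[1].strip()
(dom is the parsed document from the question; each line is split separately inside the loop.)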
But this should put you on the right track :)
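To tie this back to the minidom code in the question, here is a rough, self-contained sketch that reads the attributes and the description fields of each alarm in one pass. The helpers node_text and description_fields and the SAMPLE string are hypothetical names introduced for this example only.

```python
from xml.dom import minidom

SAMPLE = """<alarm-dictionary source="DDD" type="ProxyComponent">
<alarm code="402" severity="Alarm" name="DDM_Alarm_402">
<message>Database memory usage low threshold crossed</message>
<description>dnKinds = database
type = quality_of_service
perceived_severity = minor
probable_cause = thresholdCrossed
additional_text = Database memory usage low threshold crossed
</description>
</alarm>
</alarm-dictionary>"""


def node_text(element):
    """Concatenate the text and CDATA children of a DOM element."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE)
    )


def description_fields(alarm_tag):
    """Turn the "key = value" lines of an alarm's <description> into a dict."""
    fields = {}
    for description_tag in alarm_tag.getElementsByTagName("description"):
        for line in node_text(description_tag).splitlines():
            if "=" in line:  # skip blank lines and lines without a pair
                name, value = line.split("=", 1)
                fields[name.strip()] = value.strip()
    return fields


dom = minidom.parseString(SAMPLE)
for alarm_tag in dom.getElementsByTagName("alarm"):
    print(alarm_tag.getAttribute("code"), description_fields(alarm_tag))
```

Looking up description inside each alarm tag, rather than on the whole document, keeps every set of fields attached to the alarm it belongs to.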
Answer 2 (score: 2)
AFAIK there is no library that can handle plain text as DOM elements.
However, once you have the text in the message variable, you can do:
description = {}
messageParts = message.split("\n")
for part in messageParts:
    if "=" not in part:
        continue  # a blank line would otherwise raise IndexError below
    descInfo = part.split("=", 1)
    description[descInfo[0].strip()] = descInfo[1].strip()
You will then find the information you need in the description map, as key-value pairs.
You should also add error handling to my code...
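One possible shape for that error handling is sketched below. It is an assumption about what "error handling" should mean here, not a definitive version: parse_description and its strict flag are hypothetical, blank lines are always skipped, and a malformed line is either skipped or reported depending on the flag.

```python
def parse_description(text, strict=False):
    """Parse "key = value" lines into a dict (hypothetical helper).

    Blank lines are always skipped. A non-blank line without "=" or with an
    empty key raises ValueError when strict is True and is skipped otherwise.
    """
    description = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not key:
            if strict:
                raise ValueError("line %d is not 'key = value': %r" % (number, line))
            continue
        description[key] = value.strip()
    return description


message = "type = quality_of_service\n\nnot a pair\nprobable_cause = thresholdCrossed\n"
print(parse_description(message))
```

Calling parse_description(message, strict=True) on the same text raises ValueError for the "not a pair" line, which is handy when a malformed dictionary file should stop the import instead of being silently trimmed.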