I have XML in this format:
<event timestamp="0.447463" bustype="LIN" channel="LIN 1">
<col name="Time"/>
<col name="Start of Frame">0.440708</col>
<col name="Channel">LIN 1</col>
<col name="Dir">Tx</col>
<col name="Event Type">LIN Frame (Diagnostic Request)</col>
<col name="Frame Name">MasterReq_DB</col>
<col name="Id">3C</col>
<col name="Data">81 06 04 04 FF FF 50 4C</col>
<col name="Publisher">TestMaster (simulated)</col>
<col name="Checksum">D3 "Classic"</col>
<col name="Header Duration">2.090 ms (40.1 bits)</col>
<col name="Resp. Duration">4.688 ms (90.0 bits)</col>
<col name="Time difference">0.049987</col>
<empty/>
</event>
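In the XML above, I need to extract the data associated with the "name" attribute.
I can get all the names, but I cannot get the >MasterReq_DB< field.
Please help me.
Thanks in advance.
My Python code is: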
from xml.dom import minidom

# test_input.txt holds single-quoted values; the second one is the path of the XML file.
with open("test_input.txt", 'r') as input_file:
    alines = input_file.read()
word_lst = alines.split("'")
filename = word_lst[1]
pathname = word_lst[3]

doc = minidom.parse(pathname)
node = doc.documentElement
events = doc.getElementsByTagName('event')
for event in events:
    columns = event.getElementsByTagName('col')
    for column in columns:
        head = column.getAttribute('name')
        if head == 'Frame Name':
            print(head)
            # This is the line that fails: head is a plain str (the attribute value),
            # so it has no firstChild and an AttributeError is raised.
            request = head.firstChild.wholeText
            print(request)
print("DOne")
Answer 0 (score: 1)
If you like, you can use lxml to get started with XPath.
In [1]: x = '''<event timestamp="0.447463" bustype="LIN" channel="LIN 1">
...: <col name="Time"/>
...: <col name="Start of Frame">0.440708</col>
...: <col name="Channel">LIN 1</col>
...: <col name="Dir">Tx</col>
...: <col name="Event Type">LIN Frame (Diagnostic Request)</col>
...: <col name="Frame Name">MasterReq_DB</col>
...: <col name="Id">3C</col>
...: <col name="Data">81 06 04 04 FF FF 50 4C</col>
...: <col name="Publisher">TestMaster (simulated)</col>
...: <col name="Checksum">D3 "Classic"</col>
...: <col name="Header Duration">2.090 ms (40.1 bits)</col>
...: <col name="Resp. Duration">4.688 ms (90.0 bits)</col>
...: <col name="Time difference">0.049987</col>
...: <empty/>
...: </event> '''
In [2]: from lxml import etree
In [3]: tree = etree.fromstring(x)
In [4]: [elem.text for elem in tree.xpath('//*[@name]')]
Out[4]:
[None,
'0.440708',
'LIN 1',
'Tx',
'LIN Frame (Diagnostic Request)',
'MasterReq_DB',
'3C',
'81 06 04 04 FF FF 50 4C',
'TestMaster (simulated)',
'D3 "Classic"',
'2.090 ms (40.1 bits)',
'4.688 ms (90.0 bits)',
'0.049987']
In [5]: [name for name in tree.xpath('//@name')]
Out[5]:
['Time',
'Start of Frame',
'Channel',
'Dir',
'Event Type',
'Frame Name',
'Id',
'Data',
'Publisher',
'Checksum',
'Header Duration',
'Resp. Duration',
'Time difference']
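To pull out only the value the question asks about, an attribute predicate can narrow the match to the one column. The sketch below is an illustration rather than part of the session above: it swaps in the standard library's xml.etree.ElementTree (which supports a subset of XPath) so that it runs without lxml installed, and it uses a shortened copy of the event. With lxml, the equivalent call should be tree.xpath('//col[@name="Frame Name"]/text()').

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <event> from the question, for illustration only.
xml_text = '''<event timestamp="0.447463" bustype="LIN" channel="LIN 1">
<col name="Time"/>
<col name="Frame Name">MasterReq_DB</col>
<col name="Id">3C</col>
<empty/>
</event>'''

root = ET.fromstring(xml_text)

# Select only the <col> elements whose name attribute equals "Frame Name",
# then read the text of each matching element.
frame_names = [col.text for col in root.findall(".//col[@name='Frame Name']")]
print(frame_names)
```

The predicate does the filtering, so no explicit comparison against the attribute value is needed in Python.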
To read from a file instead of a string, use the lxml.etree.parse function.
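As a rough sketch of reading from a file, the snippet below parses a file and builds one dictionary per event, mapping each column name to its text. The helper name read_events, the <log> wrapper element, and the temporary file are assumptions made up for this example; the real log's root element is not shown in the question. The import falls back to the standard library's ElementTree if lxml is not installed, since both offer parse, iter, findall and get with the same call shapes used here.

```python
import os
import tempfile

try:
    from lxml import etree
except ImportError:
    # Standard-library fallback; the calls used below exist in both.
    import xml.etree.ElementTree as etree


def read_events(path):
    """Return one {column name: text} dict per <event> element in the file."""
    tree = etree.parse(path)
    rows = []
    for event in tree.getroot().iter("event"):
        rows.append({col.get("name"): col.text for col in event.findall("col")})
    return rows


# Hypothetical sample file; the <log> root element is an assumption.
sample = ('<log><event><col name="Time"/>'
          '<col name="Frame Name">MasterReq_DB</col></event></log>')
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

rows = read_events(sample_path)
os.remove(sample_path)
print(rows[0]["Frame Name"])
```

Empty columns such as <col name="Time"/> come back with a value of None, the same as in Out[4] above.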
Here is a link to the lxml tutorial. Here is a reference for XPath syntax.
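If you would rather stay with minidom, the loop in the question most likely fails because getAttribute returns a plain string, which has no firstChild. A minimal sketch of the fix, assuming a shortened copy of the event, reads the text from the element itself:

```python
from xml.dom import minidom

# Shortened copy of the <event> from the question, for illustration only.
xml_text = '''<event timestamp="0.447463" bustype="LIN" channel="LIN 1">
<col name="Time"/>
<col name="Frame Name">MasterReq_DB</col>
<empty/>
</event>'''

doc = minidom.parseString(xml_text)
values = []
for column in doc.getElementsByTagName('col'):
    if column.getAttribute('name') == 'Frame Name':
        # firstChild is the text node of the element; it is None for an
        # empty element such as <col name="Time"/>, so guard against that.
        if column.firstChild is not None:
            values.append(column.firstChild.data)
print(values)
```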