I have an XML file that looks like this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="pdml2html.xsl"?>
<pdml version="0" creator="wireshark/1.12.4" time="Sat Aug 15 11:18:38 2010" capture_file="">
<field name="gsm_a.dtap.msg_rr_type" showname="Message Type: System Information Type 1" size="1" pos="60" show="25" value="19"/>
<field name="" show="Cell Channel Description" size="16" pos="61" value="0000">
<field name="" show="List of ARFCNs = 1 2 3" size="16" pos="61" value="00000000480000000000000100000000"/>
</field>
<field name="gsm_a.dtap.msg_rr_type" showname="Message Type: System Information Type 2" size="1" pos="60" show="26" value="1a"/>
<field name="" show="Neighbour Cell Description - BCCH Frequency List" size="16" pos="61" value="18000002400200000001002061900000">
<field name="gsm_a.rr.ext_ind" showname="..0. .... = EXT-IND: The information element carries the complete BA (0)" size="1" pos="61" show="0" value="0" unmaskedvalue="18"/>
<field name="gsm_a.rr.format_id" showname="00.. 100. = Format Identifier: bit map 0 (0x04)" size="1" pos="61" show="4" value="4" unmaskedvalue="18"/>
<field name="" show="List of ARFCNs = 5 6 7 8 9" size="16" pos="61" value="18"/>
</field>
</pdml>
I am using Python and cElementTree to extract the information.
import sys
from xml.etree import cElementTree as ET

e = ET.parse(sys.argv[1]).getroot()
bts_id = {'A': '0', 'B': '0'}
for atype in e.iter('field'):
    if atype.attrib['name'] == "gsm_a.dtap.msg_rr_type" and atype.attrib['value'] == "19":  # System Information Type 1
        # restarts from the first field, so it always finds the first list
        for atype in e.iter('field'):
            if "List of ARFCNs =" in atype.attrib['show']:
                bts_id['A'] = atype.attrib['show'].split("= ")[1]
                break
    if atype.attrib['name'] == "gsm_a.dtap.msg_rr_type" and atype.attrib['value'] == "1a":  # System Information Type 2
        # same problem: this inner loop also restarts from the first field
        for atype in e.iter('field'):
            if "List of ARFCNs =" in atype.attrib['show']:
                bts_id['B'] = atype.attrib['show'].split("= ")[1]
                break
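I want to extract the information from two "show" attributes: show="List of ARFCNs = 1 2 3" and show="List of ARFCNs = 5 6 7 8 9". So I need A = "1 2 3" and B = "5 6 7 8 9". My code always extracts the first list, because I don't know how to keep iterating over the "field" elements of the outermost loop after the "if" statement. A further problem is that the line containing "List of ARFCNs" has no unique attribute; the unique attribute appears only a few lines above it. "System Information Type 1" and Type 2 do not necessarily appear in the XML file in the order shown above. The "for" loop should keep moving forward and never return to the first element, yet it should check every "if" statement against each "field" tag it visits (the lines in the XML file that match the "if" statements can appear in any order).
OK, I have simplified the problem: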
for a in range(50):  # loop1
    if a == 5:
        # here need to continue iterating loop1 - how?
        if a == 8:
            print("found 8")
    if a == 22:
        # here need to continue iterating loop1 - how?
        if a == 28:
            print("found 28")
    if a == 12:
        # here need to continue iterating loop1 - how?
        if a == 15:
            print("found 15")
    if a == 10:
        # here need to continue iterating loop1 - how?
        if a == 28:
            print("again found 28")
## expected result:
# found 8
# found 28
# found 15
# again found 28
## in real life range(50) can be any list, not necessarily numbers
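One possible reading of the simplified loop is "once a trigger value has been seen, keep walking the same loop and report when its target value turns up". A minimal sketch of that idea follows. It is not from the original post: the names scan, rules, and pending are hypothetical, and it assumes Python 3 and hashable items. Because there is only one pass, the messages come out in the order the targets are reached, not in the order the "if" blocks are written.

```python
def scan(items, rules):
    """Walk items once; rules maps a trigger item to (target item, message)."""
    pending = []  # (target, message) pairs armed by triggers seen so far
    found = []
    for item in items:  # the only loop over items; it never restarts
        still_waiting = []
        for target, message in pending:
            if item == target:
                found.append(message)  # an armed target was reached
            else:
                still_waiting.append((target, message))
        pending = still_waiting
        if item in rules:
            pending.append(rules[item])  # arm the target for later items
    return found


# Hypothetical rule table mirroring the four "if" blocks above.
rules = {
    5: (8, "found 8"),
    22: (28, "found 28"),
    12: (15, "found 15"),
    10: (28, "again found 28"),
}

for message in scan(range(50), rules):
    print(message)
# found 8
# found 15
# again found 28
# found 28
```

The same four messages appear as in the expected result; only their order follows the data rather than the code.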
Answer 0 (score: 0)
Does this help? At least I get
1 2 3
5 6 7 8 9
1 2 3
5 6 7 8 9
So now you just need to break in the right place.
from xml.etree import cElementTree as ET

e = ET.parse("test.xml").getroot()
bts_id = {'A': '0', 'B': '0'}
for atype in e.iter('field'):
    if atype.attrib['name'] == "gsm_a.dtap.msg_rr_type" and atype.attrib['value'] == "19":  # System Information Type 1
        # no break here, so every ARFCN list in the file is printed
        for atype in e.iter('field'):
            if "List of ARFCNs =" in atype.attrib['show']:
                print(atype.attrib['show'].split("= ")[1])
    elif atype.attrib['name'] == "gsm_a.dtap.msg_rr_type" and atype.attrib['value'] == "1a":  # System Information Type 2
        for atype in e.iter('field'):
            if "List of ARFCNs =" in atype.attrib['show']:
                print(atype.attrib['show'].split("= ")[1])
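An alternative that avoids the nested loops, and with them the question of where to break, is to walk the fields once in document order and remember which message type was seen last. The sketch below is an assumption-laden illustration rather than a definitive fix: it targets Python 3 and uses xml.etree.ElementTree (cElementTree was removed in Python 3.9), and extract_arfcns, TYPE_TO_KEY, PREFIX, and the trimmed SAMPLE document are names made up for the example. It also assumes each ARFCN list belongs to the closest preceding gsm_a.dtap.msg_rr_type field.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the message-type "value" attribute to a bts_id key.
TYPE_TO_KEY = {"19": "A", "1a": "B"}
PREFIX = "List of ARFCNs = "

# Trimmed-down copy of the XML from the question, for illustration only.
SAMPLE = """<pdml version="0">
  <field name="gsm_a.dtap.msg_rr_type" show="25" value="19"/>
  <field name="" show="Cell Channel Description" value="0000">
    <field name="" show="List of ARFCNs = 1 2 3" value="00"/>
  </field>
  <field name="gsm_a.dtap.msg_rr_type" show="26" value="1a"/>
  <field name="" show="Neighbour Cell Description - BCCH Frequency List" value="18">
    <field name="gsm_a.rr.ext_ind" show="0" value="0"/>
    <field name="" show="List of ARFCNs = 5 6 7 8 9" value="18"/>
  </field>
</pdml>"""


def extract_arfcns(root):
    """Single pass over all <field> elements in document order."""
    bts_id = {"A": "0", "B": "0"}
    current_key = None  # key for the most recently seen message type, if any
    for field in root.iter("field"):
        if field.get("name") == "gsm_a.dtap.msg_rr_type":
            # Remember the message type; unknown types reset the key to None.
            current_key = TYPE_TO_KEY.get(field.get("value"))
            continue
        show = field.get("show", "")
        if current_key is not None and show.startswith(PREFIX):
            bts_id[current_key] = show[len(PREFIX):]
            current_key = None  # take only the first list after each type
    return bts_id


root = ET.fromstring(SAMPLE)
print(extract_arfcns(root))
# {'A': '1 2 3', 'B': '5 6 7 8 9'}
```

Because the state is carried in current_key rather than in a second loop, the iteration never returns to the first element, and the two message types can appear in either order.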