XPath / lxml: getting the id of repeated child elements

Time: 2015-11-07 01:03:56

Tags: python xpath lxml

Here is an abbreviated version of my source.

<meeting>
    <race id="204091" number="1" nomnumber="4" division="0" name="FAIRFIELD RSL CLUB HANDICAP" mediumname="BM90" shortname="BM90" stage="Results" distance="1200" minweight="54" raisedweight="0" class="BM90      " age="3U        " grade="0" weightcondition="HCP       " trophy="0" owner="0" trainer="0" jockey="0" strapper="0" totalprize="85000" first="48750" second="16750" third="8350" fourth="4150" fifth="2000" time="2015-08-15T12:35:00" bonustype="BOB7      " nomsfee="0" acceptfee="0" trackcondition="Good 3    " timingmethod="Electronic" fastesttime="1-09.89   " sectionaltime="600/33.95 " formavailable="0" racebookprize="Of $85000. First $48750, second $16750, third $8350, fourth $4150, fifth $2000, sixth $1000, seventh $1000, eighth $1000, ninth $1000, tenth $1000">
    <race id="204092" number="2" nomnumber="5" division="0" name="DOOLEYS HANDICAP" mediumname="BM80" shortname="BM80" stage="Results" distance="1500" minweight="54" raisedweight="0" class="BM80      " age="3U        " grade="0" weightcondition="HCP       " trophy="0" owner="0" trainer="0" jockey="0" strapper="0" totalprize="85000" first="48750" second="16750" third="8350" fourth="4150" fifth="2000" time="2015-08-15T13:15:00" bonustype="BOB7      " nomsfee="0" acceptfee="0" trackcondition="Good 3    " timingmethod="Electronic" fastesttime="1-30.01   " sectionaltime="600/34.27 " formavailable="0" racebookprize="Of $85000. First $48750, second $16750, third $8350, fourth $4150, fifth $2000, sixth $1000, seventh $1000, eighth $1000, ninth $1000, tenth $1000">
    <race id="204093" number="3" nomnumber="2" division="0" name="CANTERBURY HURLSTONE PARK RSL CLUB SPRING PREVIEW" mediumname="SPRINGPREV" shortname="SPRINGPREV" stage="Results" distance="1400" minweight="54" raisedweight="0" class="~         " age="3U        " grade="0" weightcondition="HCP       " trophy="0" owner="0" trainer="0" jockey="0" strapper="0" totalprize="85000" first="48750" second="16750" third="8350" fourth="4150" fifth="2000" time="2015-08-15T13:50:00" bonustype="BOB7      " nomsfee="0" acceptfee="0" trackcondition="Good 3    " timingmethod="Electronic" fastesttime="1-24.61   " sectionaltime="600/33.06 " formavailable="0" racebookprize="Of $85000. First $48750, second $16750, third $8350, fourth $4150, fifth $2000, sixth $1000, seventh $1000, eighth $1000, ninth $1000, tenth $1000">
</meeting>

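My code loops the correct number of times, once per race element, but the id lookup never advances to the next element. How do I move through the tree relative to the loop?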

from lxml import objectify, etree
from builtins import len

with open('20150815RHIL0.xml', 'r') as f:
    xml = f.read()
    root = objectify.fromstring(xml)

    #===========================================================================
    # print(root.tag)
    # print(root.attrib['date'], root.attrib['weather'])
    # print(root.race.attrib['id'])
    #===========================================================================


    for races in root.race.iter():
        print(root.race.attrib['id'])

2 answers:

Answer 0 (score: 1)

You need to use the loop variable inside the loop. Replace:

    for races in root.race.iter():
        print(root.race.attrib['id'])

with:

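# Iterating root.race yields every sibling <race> element in turn;
# root.race.iter() would walk the subtree of the first <race> only.
for race in root.race:
    print(race.attrib['id'])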
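
A minimal, self-contained sketch of the same idea, in case it helps to run it in isolation. The `SAMPLE` string is a trimmed stand-in for the real file (an assumption: only a few attributes are kept, and each `<race>` is self-closed so the snippet parses on its own). It uses the plain `etree` API rather than `objectify`, and falls back to the standard library if lxml is not installed.

```python
# Fall back to the standard library when lxml is unavailable; the
# fromstring/findall/get calls used below exist in both.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Trimmed, hypothetical stand-in for 20150815RHIL0.xml.
SAMPLE = """<meeting>
    <race id="204091" number="1" name="FAIRFIELD RSL CLUB HANDICAP"/>
    <race id="204092" number="2" name="DOOLEYS HANDICAP"/>
    <race id="204093" number="3" name="CANTERBURY HURLSTONE PARK RSL CLUB SPRING PREVIEW"/>
</meeting>"""

root = etree.fromstring(SAMPLE)

race_ids = []
for race in root.findall('race'):   # direct <race> children of the root
    race_ids.append(race.get('id'))  # read from the loop variable, not from root

print(race_ids)
```

The point is only that the attribute is read from `race`, the element for the current iteration, rather than from a fixed path that always resolves to the first `<race>`.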

Answer 1 (score: 1)

It is simply because you are not using the element of the current iteration. It should be:

root = objectify.fromstring(xml)

for race in root.meeting.iter('race'):
    print(race.get('id'))

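Also, judging by your XML snippet, root.race may not be the right path: if <meeting> sits below a higher-level root element, it should be root.meeting.race, and since you want all the race elements, root.meeting.iter('race'). If <meeting> is itself the root of the parsed document, as in the snippet shown, then root already is the meeting element and the equivalent call is root.iter('race').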
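
As a rough illustration of why the nesting level matters less with `iter('race')`: it searches the whole subtree, so it finds the races whether `<meeting>` is the root or is wrapped in another element. The `<wrapper>` element and the `race_ids` helper below are hypothetical names introduced only for this sketch, and the XML is trimmed to the `id` attribute.

```python
# Fall back to the standard library when lxml is unavailable; fromstring,
# iter(tag) and get behave the same way in both for this purpose.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Case 1: <meeting> is the document root, as in the question's snippet.
MEETING_AS_ROOT = '<meeting><race id="204091"/><race id="204092"/></meeting>'

# Case 2: <meeting> nested under a hypothetical <wrapper> root.
MEETING_WRAPPED = (
    '<wrapper><meeting><race id="204091"/><race id="204092"/></meeting></wrapper>'
)


def race_ids(xml_text):
    """Return the id of every <race> element anywhere below the root."""
    root = etree.fromstring(xml_text)
    # iter('race') walks all descendants, so the depth of <meeting> is irrelevant.
    return [race.get('id') for race in root.iter('race')]


print(race_ids(MEETING_AS_ROOT))
print(race_ids(MEETING_WRAPPED))
```

The trade-off is that `iter('race')` would also pick up any `<race>` elements nested deeper in the tree, so a more specific path is preferable when the structure is known.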