Scrapy: Selecting a table row using 'role'

Time: 2017-07-07 16:50:58

Tags: python scrapy


I am trying to select a table row that looks like this (a screenshot was also attached):

<tr data-uid="65724478-5102-4fa3-8de1-17b54dd7909c" role="row"><td role="gridcell"><input id="tesT1" name="tesT1" onchange="SelectedYearDetailck({id:'R12684_2016',TaxDue:231.33,TaxYr:2016,AgEntityID:072000})" type="checkbox" value="true" checked="checked"><input name="tesT1" type="hidden" value="false"><label for="tesT1" class="cklabel"></label><input type="hidden" id="ck_2016" value="ck_R12684"><div id="grid-wait_R12684" style="display:none;" class="grid-wait"><img src="../Content/Images/25(1).GIF"></div></td><td role="gridcell">2016</td><td style="text-align:right;" role="gridcell">$231.33</td><td style="text-align:right;" role="gridcell">$327.58</td></tr>

How can I select this row using its role attribute?

1 answer:

Answer 0: (score: 1)

Match on the attribute with an XPath predicate, like this:

>>> from scrapy.selector import Selector
>>> HTML = '''\
... <tr data-uid="65724478-5102-4fa3-8de1-17b54dd7909c" role="row">
...     <td role="gridcell">...</td>
...     <td role="gridcell">2016</td>
... </tr>'''
>>> selector = Selector(text=HTML)
>>> selector.xpath('.//tr[@role="row"]').extract()
['<tr data-uid="65724478-5102-4fa3-8de1-17b54dd7909c" role="row">\n    <td role="gridcell">...</td>\n    <td role="gridcell">2016</td>\n</tr>']