Python: How do I get data from a basic table with Scrapy XPath?

Time: 2015-03-16 20:04:20

Tags: python xpath scrapy

<TABLE>
<br>

    <TR>
    <td width = 270><p align="left" style="margin-left: 0;"><b>Info</b></p></td>
    <td><p>  </p></td>
    </TR>
    <TR>
    <td width = 270><p align="left" style="margin-left: 10;">Page&nbsp;Count</p></td>
    <td><p> =  4 </p></td>
    </TR>
    ...

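I'm trying to write a response.xpath expression that pulls the = 4 value out of the table above. Even when I inspect the element in Chrome and copy the XPath from there, I end up with an empty result, []. I tried: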

/html/body/table[1]/tr[2]/td[2] 
//table[2]/tr[2]/td[2] 

Both failed.

1 Answer:

Answer 0: (score: 2)

I would find the td whose text contains Count, then take its following-sibling td:

//td[contains(p, "Count")]/following-sibling::td/p/text()

Demo:

$ scrapy shell index.html
In [1]: response.xpath('//td[contains(p, "Count")]/following-sibling::td/p/text()').extract()
Out[1]: [u' = 4 ']

If you want to extract the actual number, use .re():

In [2]: response.xpath('//td[contains(p, "Count")]/following-sibling::td/p/text()').re(r'(\d+)')
Out[2]: [u'4']
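
Note that .re() still returns strings. If an integer is needed, a small conversion step can follow. This is only a sketch using the standard re module; first_int is a hypothetical helper name, not part of Scrapy, and the input list simply copies the value shown in Out[1].

```python
import re


def first_int(strings):
    """Return the first run of digits found in `strings` as an int, or None."""
    for s in strings:
        match = re.search(r"\d+", s)
        if match:
            return int(match.group())
    return None  # no digits anywhere, e.g. the empty "Info" row


extracted = [" = 4 "]  # value copied from Out[1] above
page_count = first_int(extracted)
print(page_count)
```

Returning None instead of raising keeps the caller in control when a row, like the Info row, carries no number.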