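I'm scraping a page with a fairly simple structure. I used Chrome to work out the XPath expressions I need, but in this case they don't work.
I have XPaths like these: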
/html/body/text()[1]
/html/body/div[9]/p/span[2]/text()
But when I try:
response.xpath('/html/body/div[9]/p/span[2]/text()')
or
response.xpath('/html/body/div[9]/p/span[2]/text()').extract()
I get nothing back, just an empty list.
Answer 0 (score: 1)
You need to fix your XPath expression. Demo from the Scrapy shell:
$ scrapy shell "http://www.bbb.org/boston/business-reviews/appliances-major-dealers/dracut-appliance-center-inc-in-dracut-ma-76793/ReadReviews?page=1&exp=1" -s USER_AGENT="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36"
>>> print(response.xpath("//span[. = 'Comment from the Business']/following-sibling::span/text()").extract_first())
Mr, ********,
Thank you very much for your positive review. It's great to hear your install went smoothly. *** (our sales manager of over 45 years) and *** (Sales for over 10 years) have been notified of this positive response and truly appreciated it. We look forward to service you again in the future!!