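I am trying to scrape this page: https://answers.yahoo.com/question/index?qid=20151012004431AAyDFwK
That part works fine, but now I need to move on to the next page and do the same thing there. The next page is referenced by the "Next>" link in the top-right corner of the page.
I use this code to get the link via XPath and schedule a request that calls the parse method: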
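# Take the href of the third link whose class contains "Clr-b" (the "Next>" link)
next_page = hxs.xpath('((//a[contains(@class,"Clr-b")])[3])/@href').extract()
if next_page:  # extract() returns an empty list when nothing matches
    # urljoin resolves a relative href against the current page URL,
    # so the host does not need to be prepended by hand
    url = response.urljoin(next_page[0])
    print("NEXT -> " + url)
    yield scrapy.Request(url, callback=self.parse_page)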
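
To check how the extracted href turns into a full URL, the resolution step can be tried outside the spider. The following is only a minimal sketch using the standard library's `urllib.parse.urljoin`, which follows the same relative-URL rules that `response.urljoin` applies to the response URL. The helper name `resolve_next_url` and the sample href are hypothetical, since I have not confirmed what the real "Next>" link contains.

```python
from urllib.parse import urljoin


def resolve_next_url(page_url, hrefs):
    """Return the absolute URL of the next page, or None if no link was found.

    page_url: URL of the page currently being parsed.
    hrefs:    list of href strings, as returned by .extract() on the XPath.
    """
    if not hrefs:
        # The XPath matched nothing, e.g. on the last page.
        return None
    # urljoin handles both relative hrefs ("/question/...") and hrefs
    # that are already absolute, so no manual string concatenation is needed.
    return urljoin(page_url, hrefs[0])


if __name__ == "__main__":
    current = "https://answers.yahoo.com/question/index?qid=20151012004431AAyDFwK"
    # Hypothetical href value, for illustration only.
    sample = ["/question/index?qid=20151012004431AAyDFwK&page=2"]
    print("NEXT -> " + str(resolve_next_url(current, sample)))
```

Returning `None` for an empty list makes it easy to stop paginating when the link is missing, instead of failing with an `IndexError` on `next_page[0]`.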