Scraping the next page with Scrapy using CSS selectors

Time: 2019-08-23 16:40:51

Tags: scrapy web-crawler

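I want to scrape Zomato pages, and I need the item names and descriptions from the next page. I am comfortable with CSS selectors, so I would like to use them. I have created another function, parsenext, to do this, but I cannot work out what logic to write in it. I am new to Scrapy. I need to write something similar to what I already have for the restaurant names.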

def parse(self, response):
    rest = response.css(".result-order-flow-title.hover_feedback.zred.bold.fontsize0.ln20::attr(title)").extract()
    for restaurant in zip(rest):
        scrapped_info = {
            'restaurant': restaurant[0],
        }

        yield scrapped_info
    nextpage = response.css('.result-order-flow-title.hover_feedback.zred.bold.fontsize0.ln20::attr(href)').extract()
    if nextpage is not None:
        yield scrapy.Request(response.urljoin(nextpage), callback=self.parsenext)

def parsenext(self, response):

1 answer:

Answer 0 (score: 0)

def parse(self, response):
    rest = response.css(".result-order-flow-title.hover_feedback.zred.bold.fontsize0.ln20::attr(title)").extract()
    for restaurant in rest:
        scrapped_info = {
            'restaurant': restaurant,
        }

        yield scrapped_info
    # extract() returns a list of hrefs (empty, never None, when nothing matches),
    # and response.urljoin() takes a single string, so request each link in turn.
    nextpages = response.css('.result-order-flow-title.hover_feedback.zred.bold.fontsize0.ln20::attr(href)').extract()
    for nextpage in nextpages:
        yield scrapy.Request(response.urljoin(nextpage), callback=self.parse)