I'm new to building Scrapy spiders. My spider skips some pages. Why does this happen?

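    def parse(self, response):
        global ind  # module-level counter of detail pages, defined elsewhere

        # Detail pages are recognised by an exact body class match (note the trailing space).
        if response.xpath('body/@class').extract_first() == 'detailPage en ':
            ind = ind + 1
            # page_parse is defined elsewhere (not shown); its return value is not used here.
            page_parse(response, ind)
        else:
            next_page = response.xpath('//div[@class="pagination-zone  twelve"]/a[@data-selenium="pn-next"]/@href').extract_first()
            objectlinks = response.xpath('//h5[@data-selenium="itemHeading"]/a[@class="c5"]/@href').extract()
            # Follow every item link on the listing page first (higher priority).
            for objectlink in objectlinks:
                yield response.follow(objectlink, callback=self.parse, priority=2)
            # Follow the next listing page only if a "next" link exists.
            if next_page is not None:
                yield response.follow(next_page, callback=self.parse, priority=1)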

Thanks!
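
One possible cause, offered as a sketch rather than a confirmed diagnosis: the question does not show page_parse, but if it is a generator function that yields items, calling it without iterating over it creates a generator object and discards it, so those detail pages would appear to be skipped. The plain-Python example below reproduces that behaviour without Scrapy. The names parse_broken and parse_fixed, the dictionary layout, and the use of a string in place of a real response are all assumptions made for illustration.

```python
def page_parse(response, ind):
    # Hypothetical stand-in for the real page_parse, assumed to be a generator.
    yield {"index": ind, "page": response}


def parse_broken(response):
    # Mirrors the structure in the question: the detail branch calls
    # page_parse but never iterates over the generator it returns.
    if response.startswith("detail"):
        page_parse(response, 1)  # generator created, then thrown away
    else:
        yield "request for a listing page"


def parse_fixed(response):
    # Same logic, but the results of page_parse are passed on to the caller.
    if response.startswith("detail"):
        yield from page_parse(response, 1)
    else:
        yield "request for a listing page"


broken_items = list(parse_broken("detail-1"))
fixed_items = list(parse_fixed("detail-1"))
print(broken_items)  # nothing was produced for the detail page
print(fixed_items)   # the detail page produced one item
```

If page_parse is not a generator, other things worth checking are Scrapy's default duplicate filter, which silently drops requests for URLs it has already seen, and the exact string comparison on the body class, which fails if a detail page has any other class value.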

Scrapy spider skips pages

0 answers: