Scrapy spider skips pages

Time: 2018-06-09 07:10:17

Tags: python-2.7 web-scraping scrapy scrapy-spider scraper

I'm new to building Scrapy spiders. My spider skips several pages. Why does this happen?

    def parse(self, response):
        global ind

        if response.xpath('body/@class').extract_first() == 'detailPage en ':
            ind = ind + 1
            page_parse(response, ind)
        else:
            next_page = response.xpath('//div[@class="pagination-zone  twelve"]/a[@data-selenium="pn-next"]/@href')
            objectlinks = response.xpath('//h5[@data-selenium="itemHeading"]/a[@class="c5"]/@href').extract()
            for objectlink in objectlinks:
                yield response.follow(str(objectlink), callback=self.parse, priority=2)
            yield response.follow(next_page.extract_first(), callback=self.parse, priority=1)

Thanks!
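
One thing that may be worth checking, sketched here in plain Python without Scrapy: because `parse` contains `yield`, it is a generator, and calling another generator inside it without iterating over the result produces nothing. The snippet below assumes `page_parse` is itself a generator that yields items (the question does not show it, so this is a guess); the `is_detail` flag and the dict fields are hypothetical stand-ins for the body-class check and the scraped data.

```python
def page_parse(page, ind):
    # Hypothetical stand-in for the helper in the question,
    # assumed here to yield one item per detail page.
    yield {"index": ind, "page": page}


def parse_discarding(page, ind, is_detail):
    if is_detail:
        # Same shape as the question: a generator object is created
        # but never iterated, so its items never reach the caller.
        page_parse(page, ind)
    else:
        yield "follow-up request"


def parse_forwarding(page, ind, is_detail):
    if is_detail:
        # Re-yield every item; a plain loop also works on Python 2.7,
        # which has no `yield from`.
        for item in page_parse(page, ind):
            yield item
    else:
        yield "follow-up request"


print(list(parse_discarding("detail", 1, True)))
print(list(parse_forwarding("detail", 1, True)))
```

If `page_parse` instead writes its results somewhere itself (a file, a list) and contains no `yield`, this does not apply, and the cause would have to be looked for elsewhere, for example in a missing next-page link on some listing pages.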

0 answers:

No answers