Question

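I'm trying to scrape data from a website that spans multiple pages. However, when I run this code I get nothing back. I'm probably missing something that I can't pinpoint. Could you point it out so that I get output from the print statements? Here is the spider file: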

# -*- coding: utf-8 -*-
from scrapy import Spider
from scrapy.http import Request


class SaabSpider(Spider):
    name = 'saab'
    allowed_domains = ['thesaabsite.com/parts_om.php']
    start_urls = ['http://www.thesaabsite.com/parts_om.php']

    def parse(self, response):
        catagories=response.xpath('//ul[@class="nav col-md-offset-4 col-md-4"]/li/a/text()').extract()
        year_page_url=response.xpath('//ul[@class="nav col-md-offset-4 col-md-4"]/li/a/@href').extract()
        for j in year_page_url:
            absolute_url=response.urljoin(j)
            yield Request(absolute_url,callback=self.model_page)

    def model_page(self,response):
        year=response.xpath('//li[@class="tab-pane first text-center"]/a/text()').extract()
        year_url=response.xpath('//li[@class="tab-pane first text-center"]/a/@href').extract()
        for y in year:
            print '\n'
            print y

    def main_part(self,response):
        pass

After running it from the command prompt, I got this output:

[screenshot of the console output]

How do I get data from the next page and then move on to another one in Scrapy?
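One thing that may be worth checking, offered as an assumption because the console output is not reproduced here: `allowed_domains` is meant to hold bare domain names, and the entry in the spider above also contains a path (`/parts_om.php`). If Scrapy's offsite filtering compares that entry against the hostname of each follow-up request, the requests yielded from `parse` would be dropped and `model_page` would never run. The sketch below is a simplified, standard-library stand-in for such a hostname check, written for Python 3. The helper `host_allowed` is hypothetical and is not Scrapy's own code.

```python
import re
from urllib.parse import urlparse


def host_allowed(url, allowed_domains):
    """Hypothetical, simplified domain check (not Scrapy's implementation).

    Returns True when the URL's hostname equals one of the allowed
    domains or is a subdomain of one of them.
    """
    if not allowed_domains:
        # Assumption for this sketch: no restriction means everything passes.
        return True
    host = urlparse(url).hostname or ''
    # Match "<anything>.<domain>" or exactly "<domain>".
    pattern = r'^(.*\.)?(%s)$' % '|'.join(re.escape(d) for d in allowed_domains)
    return re.match(pattern, host) is not None


url = 'http://www.thesaabsite.com/parts_om.php'

# Entry with a path, as in the spider above: the hostname never matches it.
print(host_allowed(url, ['thesaabsite.com/parts_om.php']))  # False

# Bare domain: the hostname matches as a subdomain.
print(host_allowed(url, ['thesaabsite.com']))  # True
```

Under that assumption, changing the spider's setting to `allowed_domains = ['thesaabsite.com']` would let the requests yielded in `parse` reach `model_page`. Running the crawl and looking for "Filtered offsite request" messages in the log is one way to confirm whether this is what is happening.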

0 answers: