Scrapy: passing an item between methods

Time: 2013-12-18 16:18:15

Tags: python scrapy

Suppose I have a BookItem, and I need to add information to it in both the parse stage and the detail stage:

def parse(self, response):
    # The JSON payload is in the response body, not the response object itself
    data = json.loads(response.body)
    for book in data['result']:
        item = BookItem()
        item['id'] = book['id']
        url = book['url']
        yield Request(url, callback=self.detail)

def detail(self, response):
    hxs = HtmlXPathSelector(response)
    # I want to continue with the same book item as in the for loop above
    item['price'] = ......

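Using the code as is leaves item undefined in the detail stage. How can I pass the item on to detail? Declaring it as detail(self, response, item) does not seem to work.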

2 answers:

Answer 0: (score: 31)

Request accepts an argument called meta:

yield Request(url, callback=self.detail, meta={'item': item})

Then, in the detail function, access it like this:

item = response.meta['item']

See the linked documentation on the jobs topic for more details.
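To make the hand-off concrete, here is a minimal sketch of the whole flow. So that it runs without Scrapy installed, Request and Response below are hypothetical stand-ins that copy only the url/callback/meta behaviour, crawl is a toy scheduler, a plain dict takes the place of BookItem, and the detail page body is assumed to hold just the price. None of that comes from the question; only the meta={'item': item} / response.meta['item'] pattern does.

```python
import json


class Request:
    """Hypothetical stand-in for scrapy's Request: keeps url, callback and meta."""
    def __init__(self, url, callback, meta=None):
        self.url = url
        self.callback = callback
        self.meta = meta or {}


class Response:
    """Hypothetical stand-in for a response: a body plus the originating request's meta."""
    def __init__(self, body, meta=None):
        self.body = body
        self.meta = meta or {}


class BookSpider:
    def parse(self, response):
        data = json.loads(response.body)
        for book in data['result']:
            item = {'id': book['id']}  # plain dict standing in for BookItem
            # Attach the half-built item to the request so the callback can pick it up
            yield Request(book['url'], callback=self.detail, meta={'item': item})

    def detail(self, response):
        item = response.meta['item']  # the same item created in parse
        item['price'] = response.body.strip()  # assumption: the body is just the price
        yield item


def crawl(spider, listing_body, pages):
    """Toy scheduler: feeds each request's callback a canned page body."""
    items = []
    for request in spider.parse(Response(listing_body)):
        response = Response(pages[request.url], meta=request.meta)
        items.extend(request.callback(response))
    return items


listing = json.dumps({'result': [
    {'id': 1, 'url': 'http://example.com/1'},
    {'id': 2, 'url': 'http://example.com/2'},
]})
pages = {'http://example.com/1': '9.99', 'http://example.com/2': '14.50'}
items = crawl(BookSpider(), listing, pages)
```

Each request carries its own item, so every detail call completes the book it was created for, regardless of the order in which responses arrive.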

Answer 1: (score: 3)

You can define a variable in the __init__ method:

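class MySpider(BaseSpider):
    ...

    def __init__(self, *args, **kwargs):
        # Let the base spider run its own initialisation first
        super(MySpider, self).__init__(*args, **kwargs)
        self.item = None

    def parse(self, response):
        # The JSON payload is in the response body, not the response object itself
        data = json.loads(response.body)
        for book in data['result']:
            self.item = BookItem()
            self.item['id'] = book['id']
            url = book['url']
            yield Request(url, callback=self.detail)

    def detail(self, response):
        hxs = HtmlXPathSelector(response)
        self.item['price'] = ....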
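One thing to watch with this approach: there is only one self.item on the spider, and Scrapy runs callbacks asynchronously, so parse may have moved on to a later book before any detail callback fires. The sketch below is a simulation of that ordering, not Scrapy itself. SharedStateSpider, the (url, callback) pairs standing in for requests, and the prices dict are all made up for illustration, and it assumes every request is queued before the first response comes back.

```python
class SharedStateSpider:
    """Toy spider that keeps the current item on the instance, as in the answer above."""

    def __init__(self):
        self.item = None

    def parse(self, books):
        for book in books:
            self.item = {'id': book['id']}  # overwrites the previous book's item
            # A (url, callback) pair stands in for a real Request here
            yield book['url'], self.detail

    def detail(self, price):
        self.item['price'] = price
        return dict(self.item)  # snapshot of whatever self.item is right now


spider = SharedStateSpider()
books = [{'id': 1, 'url': 'u1'}, {'id': 2, 'url': 'u2'}]
prices = {'u1': '9.99', 'u2': '14.50'}

# Assumed ordering: all requests are queued before any callback runs
pending = list(spider.parse(books))
results = [callback(prices[url]) for url, callback in pending]
```

Under that ordering both results carry id 2, and book 1 never receives its price. Attaching the item to each request avoids this, because every callback then gets its own item.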