Question

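I'm using Scrapy with several spiders and need a custom JSON output that includes some spider stats (a list of successful requests, a list of errors, etc.). I've written a custom item pipeline, but I don't know how to access the stats from there. This is my pipeline code so far: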

import codecs
import json


class JsonWithEncodingPipeline(object):

    def open_spider(self, spider):
        # output_path is a custom attribute defined on each spider
        self.file = codecs.open(spider.output_path, 'w', encoding='utf-8')

    def process_item(self, item, spider):
        line = json.dumps(dict(item), ensure_ascii=False, indent=2) + "\n"
        self.file.write(line)
        return item

    def close_spider(self, spider):
        # Item pipelines are notified through close_spider, not spider_closed
        self.file.close()

Answer 1

You can access the stats by defining a from_crawler class method and keeping a reference to crawler.stats:

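class MyPipeline:

    def __init__(self, stats):
        # Stats collector shared by the whole crawl
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this to build the pipeline and passes in the crawler
        return cls(crawler.stats)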
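
To tie this back to the question, here is a minimal sketch of a pipeline that writes the scraped items together with the stats when the spider closes. It assumes the spider defines output_path as in the question; the class name JsonWithStatsPipeline and the "items"/"stats" layout of the output are hypothetical choices made for illustration. Some stats values (start_time, for example) are datetime objects, so default=str is used to make them serialisable.

```python
import json


class JsonWithStatsPipeline:
    """Hypothetical pipeline: buffers items and dumps them with the crawl stats."""

    def __init__(self, stats):
        self.stats = stats  # stats collector obtained from crawler.stats
        self.items = []

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.stats)

    def process_item(self, item, spider):
        # Keep a plain-dict copy so everything can be serialised in one go
        self.items.append(dict(item))
        return item

    def close_spider(self, spider):
        # get_stats() returns the collected stats as a dict
        payload = {"items": self.items, "stats": self.stats.get_stats()}
        # output_path is the custom spider attribute from the question
        with open(spider.output_path, "w", encoding="utf-8") as f:
            json.dump(payload, f, ensure_ascii=False, indent=2, default=str)
```

Note that the built-in stats are mostly counters (for example item_scraped_count) rather than lists of requests or errors, and values recorded at the very end of the crawl, such as finish_reason, may not be present yet when close_spider runs.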
