Scrapy: how to stop the spider from open_spider in a pipeline?

Asked: 2018-07-04 02:54:04

Tags: python scrapy

I want to stop the spider from the pipeline's open_spider method, but when I call raise CloseSpider(reason="record not exists"), the reason is not recorded in the log file.

from scrapy.exceptions import CloseSpider


class ArticlePipeline(object):
    def open_spider(self, spider):
        site = None
        if len(spider.allowed_domains) > 0:
            # site_model is an application-level database helper defined elsewhere
            site = site_model.get_by_hostname(spider.allowed_domains[0])
            if site and site['status'] == 1:
                spider.id = site['id']
                print(site['id'], site['sitename'])

        if not site:
            raise CloseSpider("Spider not in the database %s %s" % (spider.name, spider.allowed_domains[0]))

Output:

D:\mysite\crawler>scrapy crawl air.hk -s LOG_LEVEL=WARNING
Unhandled error in Deferred:
2018-07-04 10:52:33 [twisted] CRITICAL: Unhandled error in Deferred:

2018-07-04 10:52:33 [twisted] CRITICAL:
Traceback (most recent call last):
  File "c:\users\administrator\appdata\local\programs\python\python36-32\lib\site-packages\twisted\internet\defer.py", line 1386, in _inlineCallbacks
    result = g.send(result)
  File "c:\users\administrator\appdata\local\programs\python\python36-32\lib\site-packages\scrapy\crawler.py", line 82, in crawl
    yield self.engine.open_spider(self.spider, start_requests)
scrapy.exceptions.CloseSpider
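
For context: Scrapy's documentation describes CloseSpider as an exception to be raised from a spider callback, so raising it from a pipeline's open_spider escapes as the unhandled Deferred error shown above, and the reason never reaches the log. Below is a minimal sketch of one possible workaround that logs the reason itself and asks the running engine to shut down; it assumes spider.crawler is available (Scrapy sets it via Spider.from_crawler) and that the engine accepts a close request this early in the spider's life, which may vary by Scrapy version:

class ArticlePipeline(object):
    def open_spider(self, spider):
        site = None  # look the spider up in the database, as in the question
        if not site:
            reason = "record not exists"
            # Write the reason to the log explicitly so it is not lost,
            # then request a shutdown through the running engine instead
            # of raising CloseSpider (which only works in callbacks).
            spider.logger.warning("Closing %s: %s", spider.name, reason)
            spider.crawler.engine.close_spider(spider, reason=reason)

Logging before requesting the close guarantees the reason appears in the log file regardless of how the engine reports the shutdown.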

0 Answers:

There are no answers yet.