Question

When I use scrapy-redis, the spider is kept alive with DontCloseSpider. How can I tell when the Scrapy crawl has finished?

crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed) does not work as expected.

Answer 1

Interesting.

I see this comment in the scrapy-redis settings:

# Max idle time to prevent the spider from being closed when distributed crawling.
# This only works if queue class is SpiderQueue or SpiderStack,
# and may also block the same time when your spider start at the first time (because the queue is empty).
SCHEDULER_IDLE_BEFORE_CLOSE = 10
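For context, here is a minimal settings.py sketch showing where this option might sit. The scheduler, dupefilter and queue class paths are assumptions based on the class names in the comment above; newer scrapy-redis releases may name the queue classes differently, so check your installed version.

```python
# settings.py (sketch only; adjust to your installed scrapy-redis version)

# Let scrapy-redis schedule requests and filter duplicates through Redis.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Assumed queue class path, taken from the names mentioned in the comment above.
SCHEDULER_QUEUE_CLASS = "scrapy_redis.queue.SpiderQueue"

# Seconds to wait on an empty queue before the spider is allowed to close.
SCHEDULER_IDLE_BEFORE_CLOSE = 10
```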

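If you followed the setup instructions correctly and it still does not work, I think you need to provide at least some data that lets us reproduce your setup, for example your settings.py or any interesting spiders/pipelines you have.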

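The spider_closed signal should indeed fire, a few seconds after the URLs in the queue run out. If the queue is not empty, the spider will not close, obviously.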
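The idle-then-close behaviour described above can be illustrated with a small, library-independent sketch. IdleCloseTracker and should_close are hypothetical names invented for this example; this is not how scrapy-redis itself is implemented, only a model of the "close after the queue has stayed empty for N seconds" rule.

```python
import time


class IdleCloseTracker:
    """Hypothetical helper (not part of scrapy-redis): tracks how long the
    request queue has been empty and reports when the spider may close."""

    def __init__(self, max_idle_seconds, clock=time.monotonic):
        self.max_idle_seconds = max_idle_seconds
        self._clock = clock          # injectable clock, handy for testing
        self._idle_since = None      # time the queue was first seen empty

    def should_close(self, queue_length):
        """Call on every idle check. Returns True once the queue has stayed
        empty for at least max_idle_seconds, False otherwise."""
        if queue_length > 0:
            # Work is pending again, so reset the idle timer.
            self._idle_since = None
            return False
        now = self._clock()
        if self._idle_since is None:
            self._idle_since = now
        return now - self._idle_since >= self.max_idle_seconds
```

In a Scrapy project, one possible wiring is a handler connected to signals.spider_idle that calls should_close with the current Redis queue length and raises DontCloseSpider (from scrapy.exceptions) while it returns False. Once the handler returns normally, the spider is allowed to close and spider_closed should fire.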

How to know when scrapy-redis has finished
