Scrapy: disabling the retry middleware

Time: 2019-08-09 08:50:57

Tags: python web-scraping scrapy

I commented out this line in settings.py, but the middleware is still enabled.

DOWNLOADER_MIDDLEWARES = {
    #'scrapy.downloadermiddlewares.retry.RetryMiddleware': 90,
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
}

When the program starts, it loads many middlewares that I did not enable:

2019-08-09 10:43:37 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']

Am I missing something? Is there a way to disable it?

1 answer:

Answer 0: (score: 2)

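According to the documentation, DOWNLOADER_MIDDLEWARES is merged with DOWNLOADER_MIDDLEWARES_BASE. The latter enables scrapy.downloadermiddlewares.retry.RetryMiddleware by default, so commenting the entry out of your own setting only removes your override; the default entry still applies.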

So either write

DOWNLOADER_MIDDLEWARES = {
    # Assigning None (instead of an order number) disables the middleware.
    'scrapy.downloadermiddlewares.retry.RetryMiddleware': None,
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
}

or take a look at DOWNLOADER_MIDDLEWARES_BASE. See the Scrapy documentation for more details.
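
To make the merge behaviour concrete, here is a minimal sketch in plain Python. It is not Scrapy's actual implementation: `enabled_middlewares` is a hypothetical helper, `BASE` is an abbreviated stand-in for DOWNLOADER_MIDDLEWARES_BASE with only two entries, and the order values are illustrative. It only shows why a commented-out entry stays enabled while an entry set to None is dropped.

```python
# Hypothetical helper: a simplified picture of how the two settings combine.
# This is NOT Scrapy's real code, only an illustration of the idea.
def enabled_middlewares(base, custom):
    # Entries in the project's setting override entries in the base setting.
    merged = {**base, **custom}
    # A value of None means "disabled", so those entries are dropped.
    active = {path: order for path, order in merged.items() if order is not None}
    # Remaining middlewares are ordered by their number, lowest first.
    return sorted(active, key=active.get)


RETRY = 'scrapy.downloadermiddlewares.retry.RetryMiddleware'
PROXY = 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware'

# Abbreviated stand-in for DOWNLOADER_MIDDLEWARES_BASE (assumed order values).
BASE = {RETRY: 550, PROXY: 750}

# Case 1: the retry line is commented out, so it is simply absent here.
commented_out = {PROXY: 110}

# Case 2: the retry middleware is explicitly set to None.
set_to_none = {RETRY: None, PROXY: 110}

print(enabled_middlewares(BASE, commented_out))
print(enabled_middlewares(BASE, set_to_none))
```

In the first case the retry middleware is still in the result because the base entry is untouched; in the second case only the proxy middleware remains.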