Question

I am trying to use a middleware with Scrapy, so in my project named "Tutorial" I did the following:

In the settings file I added:

DOWNLOADER_MIDDLEWARES = {
'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
'tutorial.middlewares.ProxyMiddleware': 100,
}

I also created a file named middlewares.py containing:

import base64

# Start your middleware class
class ProxyMiddleware(object):
  # overwrite process request
  def process_request(self, request, spider):
    # Set the location of the proxy
    request.meta['proxy'] = "39.179.187.48:8123"

When I try to run the project in the shell:

scrapy shell http://google.com

I get the following error:

  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "<string>", line 2, in raiseException
TypeError: argument of type 'NoneType' is not iterable

Answer 1

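According to the documentation:

process_request() should either: return None, return a Response object, return a Request object, or raise IgnoreRequest.

Your custom middleware's process_request() method does not explicitly return anything. With a return statement added: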

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        request.meta['proxy'] = "39.179.187.48:8123"
        return request

Here request is returned on the assumption that you want to reschedule the request with the proxy set.
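
To make the documented return-value rules concrete, here is a minimal sketch of how a chain of process_request() calls could be modeled. It is a simplified stand-in, not Scrapy's implementation: FakeRequest, ReturnsNone, ReturnsRequest and run_chain are hypothetical names introduced only for illustration, and the IgnoreRequest case is not modeled.

```python
# Simplified model of how process_request() return values are treated.
# NOT Scrapy's actual code; every name in this block is hypothetical.


class FakeRequest(object):
    """Minimal stand-in for scrapy.Request (only the attributes used here)."""

    def __init__(self, url):
        self.url = url
        self.meta = {}


class ReturnsNone(object):
    """Middleware as written in the question: implicitly returns None."""

    def process_request(self, request, spider):
        request.meta['proxy'] = "39.179.187.48:8123"


class ReturnsRequest(object):
    """Middleware as written in the answer: returns the request."""

    def process_request(self, request, spider):
        request.meta['proxy'] = "39.179.187.48:8123"
        return request


def run_chain(middlewares, request, spider=None):
    """Call each process_request() in order and report what happens next."""
    for mw in middlewares:
        result = mw.process_request(request, spider)
        if result is None:
            continue  # None: carry on with the next middleware
        if isinstance(result, FakeRequest):
            # A Request stops the chain and is sent back for rescheduling
            return ("reschedule", result)
        # Anything else stands in for a Response, which also stops the chain
        return ("response", result)
    # Every middleware returned None, so the request proceeds to download
    return ("download", request)


print(run_chain([ReturnsNone()], FakeRequest("http://google.com"))[0])
print(run_chain([ReturnsRequest()], FakeRequest("http://google.com"))[0])
```

In this model the first call reports "download" and the second reports "reschedule". A rescheduled request passes through the middleware chain again.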

Scrapy middleware 'NoneType' is not iterable
