I am trying to use a middleware with Scrapy, so in my project, named "tutorial", I did the following.
In the settings file I added:
    DOWNLOADER_MIDDLEWARES = {
        'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
        'tutorial.middlewares.ProxyMiddleware': 100,
    }
I also created a file named middlewares.py that contains:
    import base64

    # Start your middleware class
    class ProxyMiddleware(object):
        # overwrite process request
        def process_request(self, request, spider):
            # Set the location of the proxy
            request.meta['proxy'] = "39.179.187.48:8123"
When I try to run the project in the shell:

    scrapy shell http://google.com

I get the following error:
      File "/usr/local/lib/python2.7/dist-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
        result.raiseException()
      File "<string>", line 2, in raiseException
    TypeError: argument of type 'NoneType' is not iterable
Answer 0 (score: 2)
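process_request() should either return None, return a Response object, return a Request object, or raise IgnoreRequest.
You are not returning anything from the process_request() method of your custom middleware: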
    class ProxyMiddleware(object):
        def process_request(self, request, spider):
            request.meta['proxy'] = "39.179.187.48:8123"
            return request
Here request is returned, assuming you want to reschedule the request with the proxy set.
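
A variant that may be worth trying, offered as an assumption rather than a confirmed diagnosis of the traceback: the Scrapy documentation writes the proxy meta value as a full URL with a scheme (for example http://host:port), so the bare host:port string could also be involved in the TypeError. The sketch below sets a scheme-prefixed proxy and returns None, which tells Scrapy to keep processing the same request through the remaining middlewares instead of rescheduling it. PROXY_URL is a name introduced here for illustration, and the http:// scheme assumes the proxy speaks plain HTTP.

```python
# middlewares.py: a minimal sketch under the assumptions stated above.

# Assumption: the proxy from the question accepts plain HTTP connections.
PROXY_URL = "http://39.179.187.48:8123"


class ProxyMiddleware(object):
    """Downloader middleware that routes every request through PROXY_URL."""

    def process_request(self, request, spider):
        # Store the proxy as a full URL including the scheme,
        # not as a bare host:port pair.
        request.meta['proxy'] = PROXY_URL
        # Returning None lets Scrapy continue with the remaining
        # middlewares and the download handler for this request.
        return None
```

The DOWNLOADER_MIDDLEWARES entry for tutorial.middlewares.ProxyMiddleware stays as in the question. On Scrapy 1.0 and later, the built-in middleware is importable as scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware; the scrapy.contrib path is the older location.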