Incorrectly configured Scrapy proxy

Date: 2017-06-16 04:07:48

Tags: python selenium proxy web-scraping scrapy

I'm getting an error while scraping profiles. I assume I'm using my proxy incorrectly, but what is the actual error here? Can you help?

  

    2017-06-15 21:35:17 [scrapy.proxies] INFO: Removing failed proxy, 12 proxies left
    2017-06-15 21:35:17 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.linkedin.com/in/jiajie-jacky-fan-80920083/>
    Traceback (most recent call last):
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
        result = result.throwExceptionIntoGenerator(g)
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
        return g.throw(self.type, self.value, self.tb)
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/Scrapy-1.4.0-py2.7.egg/scrapy/core/downloader/middleware.py", line 43, in process_request
        defer.returnValue((yield download_func(request=request, spider=spider)))
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/Scrapy-1.4.0-py2.7.egg/scrapy/utils/defer.py", line 45, in mustbe_deferred
        result = f(*args, **kw)
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/Scrapy-1.4.0-py2.7.egg/scrapy/core/downloader/handlers/__init__.py", line 65, in download_request
        return handler.download_request(request, spider)
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/Scrapy-1.4.0-py2.7.egg/scrapy/core/downloader/handlers/http11.py", line 63, in download_request
        return agent.download_request(request)
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/Scrapy-1.4.0-py2.7.egg/scrapy/core/downloader/handlers/http11.py", line 272, in download_request
        agent = self._get_agent(request, timeout)
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/Scrapy-1.4.0-py2.7.egg/scrapy/core/downloader/handlers/http11.py", line 252, in _get_agent
        _, _, proxyHost, proxyPort, proxyParams = _parse(proxy)
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/Scrapy-1.4.0-py2.7.egg/scrapy/core/downloader/webclient.py", line 37, in _parse
        return _parsed_url_args(parsed)
      File "/Users/jiajiefan/data_mining/lib/python2.7/site-packages/Scrapy-1.4.0-py2.7.egg/scrapy/core/downloader/webclient.py", line 21, in _parsed_url_args
        port = parsed.port
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urlparse.py", line 113, in port
        port = int(port, 10)
    ValueError: invalid literal for int() with base 10: '178.32.255.199'

1 answer:

Answer 0 (score: 0):

The proxy URL should include the scheme, e.g. 'http://':

rq.meta['proxy'] = 'http://127.0.0.1:8123'
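If the proxies come from a list that may or may not include a scheme, one option is to normalize each entry before assigning it to the request meta. A minimal sketch, where `ensure_scheme` is a hypothetical helper (not part of Scrapy):

```python
def ensure_scheme(proxy, default_scheme="http"):
    """Hypothetical helper: prepend a scheme when the proxy string lacks one."""
    if "://" not in proxy:
        return "%s://%s" % (default_scheme, proxy)
    return proxy

print(ensure_scheme("178.32.255.199:8080"))         # http://178.32.255.199:8080
print(ensure_scheme("https://178.32.255.199:8080")) # https://178.32.255.199:8080

# In the spider it would then be used as:
# rq.meta['proxy'] = ensure_scheme(proxy_from_list)
```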