My midwares.py contains the following code, as mentioned here, with which I am trying to change my IP through Tor on every request:
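import random
from stem import Signal
from stem.control import Controller

# NOTE: `settings` is assumed to be the Scrapy settings object, imported elsewhere in this module.

def _set_new_ip():
    # Ask Tor for a new identity through its control port (9051).
    with Controller.from_port(port=9051) as controller:
        controller.authenticate(password='tor_password')
        controller.signal(Signal.NEWNYM)

class RandomUserAgentMiddleware(object):
    def process_request(self, request, spider):
        # Pick a random User-Agent from USER_AGENT_LIST for this request.
        ua = random.choice(settings.get('USER_AGENT_LIST'))
        if ua:
            request.headers.setdefault('User-Agent', ua)

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        # Request a new Tor identity, then send the request via the local HTTP proxy.
        _set_new_ip()
        request.meta['proxy'] = 'http://127.0.0.1:8118'
        spider.log('Proxy : %s' % request.meta['proxy'])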
But when I try to start crawling with Scrapy, it keeps responding with:
2017-09-10 22:36:44 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-09-10 22:36:44 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0004)
2017-09-10 22:36:44 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
2017-09-10 22:36:44 [IT] DEBUG: Proxy : http://127.0.0.1:8118
2017-09-10 22:36:44 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 1 times): Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:44 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0003)
2017-09-10 22:36:44 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
2017-09-10 22:36:44 [IT] DEBUG: Proxy : http://127.0.0.1:8118
2017-09-10 22:36:52 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 2 times): Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:52 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0004)
2017-09-10 22:36:52 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
2017-09-10 22:36:52 [IT] DEBUG: Proxy : http://127.0.0.1:8118
2017-09-10 22:36:56 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 3 times): Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:56 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology>: Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:56 [scrapy.core.engine] INFO: Closing spider (finished)
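Since the log shows both a closed socket on the Tor control connection and "Connection refused" on the download, one possible first check is whether anything is actually accepting connections on the two local ports the middleware relies on (9051 and 8118, both taken from the code above). The following is only a minimal diagnostic sketch using the standard library; the helper name `is_port_open` is hypothetical, and an open port does not by itself prove that the service behind it is configured correctly.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the connection attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the middleware above: 9051 for the Tor control
    # connection, 8118 for the local HTTP proxy.
    for name, port in (("Tor control port", 9051), ("HTTP proxy", 8118)):
        state = "accepting connections" if is_port_open("127.0.0.1", port) else "closed or refusing"
        print("%s (%d): %s" % (name, port, state))
```

If port 8118 is reported as closed, the "Connection refused" errors would be consistent with the proxy process not running or listening elsewhere. If 9051 is closed, the same may apply to Tor's control port.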