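Scraping problem: an unresolved issue

Time: 2018-11-12 15:06:03

Tags: python-requests

I am trying to set a random proxy when making requests, but I have run into some problems. Here is my code: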

import requests
import random

pool = ['220.186.175.252:4216', '106.110.39.106:4232']
proxy = {'https': random.choice(pool)}
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
test_url = "http://httpbin.org/ip"  # URL that echoes the caller's IP
response = requests.get(url=test_url, headers=headers, proxies=proxy)
text = response.text
print(text)

And the result:

{"origin": "112.10.164.203"}

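It did not work: the response still shows my own IP. I thought the request might be using http rather than https, so I changed the proxy to:

proxy = {'http': random.choice(pool)}

Unfortunately, I got an error: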

requests.exceptions.ProxyError: HTTPConnectionPool(host='106.110.39.106', port=4232): Max retries exceeded with url: http://httpbin.org/ip (Caused by ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response',)))

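So I have two questions:
1. How do I set a random proxy in requests?
2. Why do I get this error when I change the proxy?

I would be glad if you could solve my problem!

1 answer:

Answer 0: (score: 1)

You did it right. You get this error because your proxy does not support http requests. Before using a proxy, you need to know which protocols it supports (see free-proxy-list).

This is how I define a random proxy: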
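To illustrate why the first attempt returned the asker's own IP while the second one raised ProxyError: requests picks the proxy entry whose key matches the scheme of the URL being fetched. The snippet below is only a simplified sketch of that lookup. `pick_proxy` is a hypothetical helper, not part of requests, and it ignores per-host keys and proxy environment variables.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Simplified, hypothetical stand-in for scheme-based proxy lookup."""
    scheme = urlparse(url).scheme.lower()
    # A scheme-specific entry wins; "all" acts as a catch-all fallback.
    return proxies.get(scheme) or proxies.get("all")


test_url = "http://httpbin.org/ip"

# Only an "https" entry: nothing matches an http:// URL, so no proxy is used.
print(pick_proxy(test_url, {"https": "220.186.175.252:4216"}))

# An "http" entry matches, so the request would be sent through this proxy.
print(pick_proxy(test_url, {"http": "106.110.39.106:4232"}))
```

With only an `https` key the http test URL bypasses the proxy entirely, which matches the first result in the question. Once an `http` key exists the proxy is actually contacted, and a proxy that cannot handle the request fails as shown in the traceback.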

import random

import requests

# Keep separate pools for proxies that handle HTTPS and plain HTTP traffic.
https_pool = ['220.186.175.252:4216', '106.110.39.106:4232']
http_pool = ['169.50.180.250:3128']

# requests uses the entry whose key matches the scheme of the requested URL.
proxy = {'https': random.choice(https_pool), 'http': random.choice(http_pool)}
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
test_url = 'http://httpbin.org/ip'  # echoes the IP address the request came from
response = requests.get(url=test_url, headers=headers, proxies=proxy)
print(response.text)
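If a fresh proxy is wanted for every request, the selection above could be wrapped in a small function. This is a minimal sketch rather than a definitive implementation: `random_proxies` and its `rng` parameter are hypothetical names, and prefixing each address with `http://` assumes the pools contain ordinary HTTP proxies.

```python
import random


def random_proxies(http_pool, https_pool, rng=random):
    """Build a proxies mapping with one random entry per URL scheme.

    Hypothetical helper: each pool lists "host:port" proxies that are
    assumed to have been checked for the traffic they are used for.
    """
    return {
        # Both values use an http:// proxy URL, which assumes a plain HTTP
        # proxy (HTTPS traffic is tunnelled through it with CONNECT).
        "http": "http://" + rng.choice(http_pool),
        "https": "http://" + rng.choice(https_pool),
    }


http_pool = ["169.50.180.250:3128"]
https_pool = ["220.186.175.252:4216", "106.110.39.106:4232"]

proxies = random_proxies(http_pool, https_pool)
print(proxies)
```

The returned dict can be passed as `proxies=proxies` to `requests.get`, ideally together with a `timeout`. Free proxies go offline often, so it may be worth catching `requests.exceptions.ProxyError` and calling `random_proxies` again to try a different one.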