In my script, I am trying to make a request through a proxy server. I simply do this:
import requests
response = requests.get('https://websiteiwhantget', proxies={"http": '176.36.111.9:56323', "https": '176.36.111.9:56323'})
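I took the proxy IP address from https://free-proxy-list.net/, but when I run the script, every website I pass to the get call fails with:
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='www.moma.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response',)))
If I remove the proxies argument from requests.get, everything works. Why does my script fail when it uses the proxy? Are the proxies listed on free-proxy-list bad, or do I have to change my Python call? I am using Python 3.6.
Many thanks, AM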
Answer 0 (score: 0)
Read: https://www.scrapehero.com/how-to-rotate-proxies-and-ip-addresses-using-python-3/
Then try:
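import random

import requests
from lxml.html import fromstring

# Scrape the first 20 rows of the free proxy list.
url = 'https://free-proxy-list.net/anonymous-proxy.html'
response = requests.get(url)
parser = fromstring(response.text)

proxies = []
for row in parser.xpath('//tbody/tr')[:20]:
    # Keep only rows whose 7th column (expected to be the HTTPS flag) says "yes".
    if row.xpath('.//td[7][contains(text(),"yes")]'):
        proxy = ":".join([row.xpath('.//td[1]/text()')[0], row.xpath('.//td[2]/text()')[0]])
        try:
            # Keep the proxy only if it can actually reach a known site.
            t = requests.get("https://www.google.com/", proxies={"http": proxy, "https": proxy}, timeout=5)
            if t.status_code == requests.codes.ok:
                proxies.append(proxy)
        except requests.exceptions.RequestException:
            pass

# Stop early if none of the scraped proxies responded.
if not proxies:
    raise SystemExit("No working proxy found; try again later.")

# Pick one of the working proxies at random.
proxy = random.choice(proxies)
response = requests.get('https://websiteiwhantget', proxies={"http": proxy, "https": proxy})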
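A proxy that passed the check a moment ago can still fail on the real request, so it may help to fall back to another candidate instead of relying on a single random pick. Below is a minimal sketch of that idea, not a definitive implementation. The names to_proxies and fetch_with_rotation are hypothetical helpers, not part of requests, and the fetch argument is an assumption: any callable that takes (url, proxies) and raises on failure. The sketch also adds an explicit http:// scheme to each address, which is how the requests documentation writes proxy URLs; whether the missing scheme matters in your setup may depend on the installed requests and urllib3 versions.

```python
import random


def to_proxies(address, scheme="http"):
    """Build a requests-style proxies mapping from an 'ip:port' string.

    Hypothetical helper: adds a scheme only when the address lacks one.
    """
    if "://" not in address:
        address = f"{scheme}://{address}"
    return {"http": address, "https": address}


def fetch_with_rotation(url, candidates, fetch, max_attempts=5):
    """Try up to max_attempts candidates in random order.

    `fetch(url, proxies)` is supplied by the caller and is expected to
    raise an exception when the proxy does not work.
    """
    pool = list(candidates)
    random.shuffle(pool)
    tried = pool[:max_attempts]
    last_error = None
    for address in tried:
        try:
            return fetch(url, to_proxies(address))
        except Exception as exc:  # broad on purpose: fetch is caller-supplied
            last_error = exc
    raise RuntimeError(f"no working proxy among {len(tried)} tried") from last_error
```

With requests, fetch could be something like lambda url, proxies: requests.get(url, proxies=proxies, timeout=5), and candidates would be the proxies list built above.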