Python requests with a proxy

Time: 2019-11-23 14:54:04

Tags: python-3.x web-scraping proxy

In my script I am trying to make a request through a proxy server. This is all I do:

    import requests

    response = requests.get('https://websiteiwhantget', proxies={"http": '176.36.111.9:56323', "https": '176.36.111.9:56323'})

I took the proxy IP address from https://free-proxy-list.net/, but when I run the script, every website I pass to the get call fails with:

  

    raise ProxyError(e, request=request)
    requests.exceptions.ProxyError: HTTPSConnectionPool(host='www.moma.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response',)))

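If I remove the proxies argument from requests.get, everything works. Why does my script fail when I use a proxy? Are the proxies listed on free-proxy-list bad, or do I have to change my Python call? I am using Python 3.6.

Many thanks, AM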

1 answer:

Answer 0: (score: 0)

Read: https://www.scrapehero.com/how-to-rotate-proxies-and-ip-addresses-using-python-3/

Then try:

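    import random

    import requests
    from lxml.html import fromstring

    # Scrape candidate proxies from the public list.
    url = 'https://free-proxy-list.net/anonymous-proxy.html'
    response = requests.get(url, timeout=10)
    parser = fromstring(response.text)

    proxies = []
    for row in parser.xpath('//tbody/tr')[:20]:
        # Keep only rows whose 7th column (HTTPS support) says "yes".
        if not row.xpath('.//td[7][contains(text(),"yes")]'):
            continue
        proxy = ":".join([row.xpath('.//td[1]/text()')[0], row.xpath('.//td[2]/text()')[0]])

        # Test the proxy before trusting it; free proxies are often dead.
        try:
            t = requests.get("https://www.google.com/", proxies={"http": proxy, "https": proxy}, timeout=5)
            if t.status_code == requests.codes.ok:
                proxies.append(proxy)
        except requests.exceptions.RequestException:
            pass

    if not proxies:
        raise SystemExit("No working proxy found; try again later.")

    # Pick one of the working proxies at random.
    proxy = random.choice(proxies)

    response = requests.get('https://websiteiwhantget', proxies={"http": proxy, "https": proxy})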
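A proxy that passed the check a moment ago can still drop the next request, so it may help to retry through the remaining candidates instead of relying on a single pick. Below is a minimal sketch of that idea, not a definitive implementation. The helper names `proxies_dict` and `fetch_with_rotation` are hypothetical, and it assumes the list entries are plain HTTP proxies given as `ip:port`:

```python
import random

import requests


def proxies_dict(proxy):
    """Build the mapping requests expects from a bare 'ip:port' string.

    Assumption: the proxy speaks plain HTTP, so an explicit http:// scheme is added.
    """
    if "://" not in proxy:
        proxy = "http://" + proxy
    return {"http": proxy, "https": proxy}


def fetch_with_rotation(url, candidates, get=requests.get, timeout=5):
    """Try the candidate proxies in random order.

    Returns the first response whose status is below 400, or None if
    every proxy fails. `get` is injectable so the function can be
    exercised without network access.
    """
    pool = list(candidates)
    random.shuffle(pool)
    for proxy in pool:
        try:
            response = get(url, proxies=proxies_dict(proxy), timeout=timeout)
        except requests.exceptions.RequestException:
            # Dead or misbehaving proxy: move on to the next one.
            continue
        if response.ok:
            return response
    return None
```

With the `proxies` list built above, usage would look like `response = fetch_with_rotation('https://websiteiwhantget', proxies)`; a return value of `None` means none of the candidates worked.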