Python urllib error with HTTPS websites

Time: 2015-07-31 14:02:35

Tags: python python-3.x urllib

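I am using Python 3.4 on Windows 7, and I am trying to use a Python script to test whether a proxy allows or denies connections to specific websites.

I am using the following code: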

import urllib.request
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

# login, password and proxy are set elsewhere in the script (not shown)
conf = "http://{}:{}@{}".format(login, password, proxy)
supp = urllib.request.ProxyHandler({"http": conf})
auth = urllib.request.HTTPBasicAuthHandler()
opener = urllib.request.build_opener(supp, auth, urllib.request.HTTPHandler)
urllib.request.install_opener(opener)
response = urlopen(Request("http://www.google.com"))

The code above runs without errors, but as soon as I switch the URL to HTTPS (for example, https://www.google.com), I get the following error:

C:\Python34\python.exe test_url.py
Traceback (most recent call last):
  File "C:\Python34\lib\urllib\request.py", line 1182, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "C:\Python34\lib\http\client.py", line 1088, in request
    self._send_request(method, url, body, headers)
  File "C:\Python34\lib\http\client.py", line 1126, in _send_request
    self.endheaders(body)
  File "C:\Python34\lib\http\client.py", line 1084, in endheaders
    self._send_output(message_body)
  File "C:\Python34\lib\http\client.py", line 922, in _send_output
    self.send(msg)
  File "C:\Python34\lib\http\client.py", line 857, in send
    self.connect()
  File "C:\Python34\lib\http\client.py", line 1223, in connect
    super().connect()
  File "C:\Python34\lib\http\client.py", line 834, in connect
    self.timeout, self.source_address)
  File "C:\Python34\lib\socket.py", line 494, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "C:\Python34\lib\socket.py", line 533, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11004] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 14, in <module>
    response = urlopen(Request("https://www.google.com"))
  File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 463, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 481, in _open
    '_open', req)
  File "C:\Python34\lib\urllib\request.py", line 441, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 1225, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Python34\lib\urllib\request.py", line 1184, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 11004] getaddrinfo failed>

Why does my code only work with HTTP websites?

1 answer:

Answer 0: (score: 0)

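You need to specify the proxy for HTTPS separately, because it is a different protocol from HTTP. With only an "http" entry, urllib most likely skips the proxy for https:// URLs and tries to resolve the host directly, which would explain the getaddrinfo failure. So the ProxyHandler line should be changed to: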

supp = urllib.request.ProxyHandler({"http": conf, "https": conf})
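
For completeness, here is a minimal sketch of how the whole check might look with that change applied. The helper names build_proxy_opener and is_allowed are made up for illustration and are not part of the question's code, and the proxy address in the usage note is a placeholder. It also assumes the proxy accepts basic credentials embedded in the proxy URL, as in the question; the credentials are percent-encoded so that characters such as "@" do not break that URL.

```python
import urllib.request
from urllib.error import URLError
from urllib.parse import quote


def build_proxy_opener(login, password, proxy):
    """Return an opener that sends both HTTP and HTTPS requests via the proxy.

    Hypothetical helper: `proxy` is expected as "host:port".
    """
    # Percent-encode the credentials; urllib decodes them again before
    # building the Proxy-Authorization header.
    conf = "http://{}:{}@{}".format(
        quote(login, safe=""), quote(password, safe=""), proxy
    )
    # One entry per URL scheme: without the "https" key, HTTPS requests
    # are not routed through this proxy.
    handler = urllib.request.ProxyHandler({"http": conf, "https": conf})
    return urllib.request.build_opener(handler)


def is_allowed(opener, url, timeout=10):
    """Return True if the URL could be opened through the opener.

    Hypothetical helper: any URLError (including HTTPError) is treated
    as "denied or unreachable".
    """
    try:
        with opener.open(url, timeout=timeout):
            return True
    except URLError:
        return False
```

It could then be used along the lines of opener = build_proxy_opener(login, password, "proxy.example.com:8080") followed by is_allowed(opener, "https://www.google.com"). Calling the opener directly instead of install_opener keeps the proxy settings local to this check rather than changing the behaviour of every later urlopen call.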