Cannot access https sites with urllib on Python 2.7

Time: 2013-10-31 19:02:31

Tags: python ssl https urllib

eduardo@camizao:/$ python2.7 
Python 2.7.3 (default, Sep 26 2013, 20:03:06) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> url1 = 'http://www.google.com'
>>> url2 = 'https://www.google.com'
>>> f = urllib.urlopen(url1) 
>>> f = urllib.urlopen(url2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 211, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 355, in open_http
    'got a bad status line', None)
IOError: ('http protocol error', 0, 'got a bad status line', None)
>>> 

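When I try to connect to an https site with urllib, I get the error above. The proxy is set correctly. While debugging the Python code, I noticed that the import of the ssl library in urllib.py is never executed, so the https call is never made either. Can anyone help me? I have to use urllib, not urllib2 or anything else. Thanks in advance.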

1 answer:

Answer 0: (score: 0)

At least there is nothing wrong with the code as you wrote it:

$ python
Python 2.7.4 (default, Sep 26 2013, 03:20:26) 
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> url1 = 'http://www.google.com'
>>> url2 = 'https://www.google.com'
>>> f = urllib.urlopen(url1)
>>> f = urllib.urlopen(url2)
>>> f.read()[:15]
'<!doctype html>'
>>>

So the code itself is not the problem. It must be related to your environment or configuration. You said you are using a proxy?
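To narrow down an environment problem, it may help to check whether the ssl module can be imported at all and which proxies urllib picks up from the environment. The following is only a minimal sketch: the helper name check_environment is hypothetical (not part of urllib), and the Python 3 import is included only as a fallback so the snippet runs on either version.

```python
# Sketch: report whether ssl is importable and which proxies urllib detects.
try:
    from urllib import getproxies  # Python 2, as used in this question
except ImportError:
    from urllib.request import getproxies  # Python 3 fallback


def check_environment():
    """Return a dict describing SSL support and detected proxy settings.

    check_environment is a hypothetical helper written for this example.
    """
    try:
        import ssl  # noqa: F401 (imported only to test availability)
        ssl_available = True
    except ImportError:
        ssl_available = False
    # getproxies() returns a dict mapping URL scheme to proxy URL,
    # typically built from variables such as http_proxy / https_proxy.
    return {'ssl_available': ssl_available, 'proxies': getproxies()}


if __name__ == '__main__':
    print(check_environment())
```

If ssl_available comes back False, the interpreter was probably built without SSL support, which would be consistent with the ssl import in urllib.py not taking effect.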

Edit

I can open it through an open proxy (I won't include the actual proxy, since who knows whether it is sketchy; substitute your own):

$ python
Python 2.7.4 (default, Sep 26 2013, 03:20:26) 
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> proxy_handler = urllib2.ProxyHandler({'http': 'http://some-sketchy-open-proxy'})
>>> opener = urllib2.build_opener(proxy_handler)
>>> opener.open('https://www.google.com')
<addinfourl at 140512985881056 whose fp = <socket._fileobject object at 0x7fcbba9b1ed0>>
>>> _.read()[:15]
'<!doctype html>'
>>> 

Try it with your own proxy URL (note that I used urllib2 here, not urllib). Hope that helps!

Edit 2

Using urllib only:

$ python
Python 2.7.4 (default, Sep 26 2013, 03:20:26) 
[GCC 4.7.3] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> import urllib
>>> proxies = {'http': '189.112.3.87:3128'}
>>> url = 'https://www.google.com'
>>> filehandle = urllib.urlopen(url,proxies=proxies)
>>> filehandle.read()[:15]
'<!doctype html>'
>>>
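
One thing worth checking with your own proxy: as far as I understand it, urllib and urllib2 choose the proxy by the scheme of the URL being opened, so a proxies dict that has only an 'http' key may not be applied to an https:// URL at all. The sketch below just illustrates that lookup. It is not urllib's actual code; proxy_for is a hypothetical helper and the proxy address is a placeholder.

```python
# Sketch: illustrate a scheme-keyed proxy lookup (not urllib's real code).
try:
    from urlparse import urlparse  # Python 2
except ImportError:
    from urllib.parse import urlparse  # Python 3 fallback


def proxy_for(url, proxies):
    """Return the proxy entry keyed by the URL's scheme, or None if absent.

    proxy_for is a hypothetical helper written for this example.
    """
    return proxies.get(urlparse(url).scheme)


# Placeholder proxy address, used only for illustration.
proxies = {'http': 'http://proxy.example:3128'}
print(proxy_for('http://www.google.com', proxies))   # http://proxy.example:3128
print(proxy_for('https://www.google.com', proxies))  # None
```

If that is what is happening, adding an 'https' entry to the dict (or setting https_proxy in the environment) would be the thing to experiment with.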