Python Web Scraping SSLError: bad handshake

Time: 2016-09-02 14:19:41

Tags: python ssl beautifulsoup python-requests

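I'm new to programming, especially when it comes to web scraping and SSL/TLS certificates, so please go easy on me, haha!

Right, so...

The script I wrote runs perfectly on my Mac (OS X El Capitan 10.11.15); it essentially scrapes some numbers from a website that requires a login. The problem occurs when I try to run the program on two other computers, one running Windows 7 64-bit and the other Windows 10 32-bit. I have a basic understanding of handshakes and SSL, but not enough to figure out what is going on here.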

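The website I am trying to access has some certificate problems (Chrome won't let you access it, but Mozilla and Safari will), but I have always gotten around this by passing verify=False to requests.Session().get(...). By the way, at the top of my program I have: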

 with requests.Session() as c:

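in case line 179 of the traceback is a bit confusing.

I also have this at the start of my script (although commenting it out makes no difference):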

requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS += ':RC4-SHA'
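A quick way to see whether appending ':RC4-SHA' can have any effect on a given machine is to ask the local OpenSSL build whether it offers that cipher at all. The sketch below uses only the standard-library ssl module; cipher_supported is a hypothetical helper name, not part of requests. One caveat: the "bad handshake: Error([...])" wording in the traceback looks like it comes from pyOpenSSL, which is installed here and may be linked against a different OpenSSL than the ssl module, so treat the result as a hint rather than proof.

```python
import ssl


def cipher_supported(cipher_spec):
    """Return True if the local OpenSSL build accepts cipher_spec.

    Hypothetical helper for diagnosis only. set_ciphers() raises
    ssl.SSLError when the spec selects no cipher at all.
    """
    ctx = ssl.create_default_context()
    try:
        ctx.set_ciphers(cipher_spec)
    except ssl.SSLError:
        return False
    return True


# Report what this interpreter's ssl module is built against.
print("OpenSSL build: {0}".format(ssl.OPENSSL_VERSION))
print("RC4-SHA offered locally: {0}".format(cipher_supported("RC4-SHA")))
```

Running this on the Mac and on both Windows machines and comparing the output would show whether the machines differ in what they can offer the server.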

Here is the full traceback:

SSLError                                  Traceback (most recent call last)
C:\Users\indy_\Desktop\bk_programs\var.py in <module>()
    226 hh_saleslist, hh_settleinvlist = hh_scraper()
    227 tillster_webvals = tillster_puller()
--> 228 sicom_report_esg, sicom_report_cor, sicom_report_nor, sicom_report_gry, sicom_report_bst = sicom_scraper()
    229 
    230 

C:\Users\indy_\Desktop\bk_programs\var.py in sicom_scraper()
    177         for i in range(len(ips)):
    178             # Retrieve the CSRF token first
--> 179             soup = BeautifulSoup(c.get(login_urllist[i], verify=False).content, 'html.parser')
    180             csrftoken = soup.find('input', dict(name='XXX_login_token'))['value']
    181 

C:\Users\indy_\AppData\Local\Enthought\Canopy32\User\lib\site-packages\requests\sessions.pyc in get(self, url, **kwargs)
    485 
    486         kwargs.setdefault('allow_redirects', True)
--> 487         return self.request('GET', url, **kwargs)
    488 
    489     def options(self, url, **kwargs):

C:\Users\indy_\AppData\Local\Enthought\Canopy32\User\lib\site-packages\requests\sessions.pyc in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    473         }
    474         send_kwargs.update(settings)
--> 475         resp = self.send(prep, **send_kwargs)
    476 
    477         return resp

C:\Users\indy_\AppData\Local\Enthought\Canopy32\User\lib\site-packages\requests\sessions.pyc in send(self, request, **kwargs)
    583 
    584         # Send the request
--> 585         r = adapter.send(request, **kwargs)
    586 
    587         # Total elapsed time of the request (approximately)

C:\Users\indy_\AppData\Local\Enthought\Canopy32\User\lib\site-packages\requests\adapters.pyc in send(self, request, stream, timeout, verify, cert, proxies)
    475         except (_SSLError, _HTTPError) as e:
    476             if isinstance(e, _SSLError):
--> 477                 raise SSLError(e, request=request)
    478             elif isinstance(e, ReadTimeoutError):
    479                 raise ReadTimeout(e, request=request)

SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')],)",) 

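I know there are other posts similar to mine, but I can't seem to get any of them to work for me. If anyone could point out my mistake or send me some useful documentation, I would really appreciate it.

P.S. I am running Python 2.7.11.

Thanks

Edit - more information

The URL in question is https://81.137.196.234/~sicom/mgrng/login.php

And a list of all installed modules: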

'-registry-path==1.0', 'appinst==2.1.5', 'apptools==4.4.0', 'atom==0.3.10', 'backports-abc==0.4', 'beautifulsoup4==4.4.1', 'beautifulsoup==3.2.1', 'casuarius==1.1', 'catalyst==1.0.2', 'certifi==2016.2.28', 'cffi==1.7.0', 'chaco==4.5.0', 'chardet==2.3.0', 'codetools==4.2.0', 'configobj==5.0.6', 'configparser==3.5.0', 'cryptography==1.5', 'cycler==0.10.0', 'decorator==4.0.9', 'distribute-remove==1.0.0', 'docutils==0.12', 'enable==4.5.1', 'enaml==0.9.8', 'enclosure==0.4.3', 'encore==0.6.0', 'enstaller==4.8.12', 'entrypoints==0.2.2', 'envisage==4.5.1', 'examples==7.3', 'faulthandler==2.4', 'flake8==2.5.1', 'futures==3.0.3', 'html5lib==0.999', 'humanize==0.5.1', 'idle==2.7.3', 'idna==2.1', 'ipaddress==1.0.16', 'ipykernel==4.3.1', 'ipython-genutils==0.1.0', 'ipython4==4.1.2', 'ipython==4.0.0', 'ipywidgets==5.1.5', 'jinja2==2.8', 'jsonschema==2.4.0', 'jupyter-client==4.2.2', 'jupyter-console==4.1.1', 'jupyter-core==4.1.0', 'jupyter==1.0.0', 'kernmagic==0.2.0', 'kiwisolver==0.1.3', 'libopenjpeg==2.1.0', 'libpng==1.6.12', 'libsodium==1.0.3', 'lxml==3.6.0', 'markupsafe==0.23', 'matplotlib==1.5.1', 'mccabe==0.3.1', 'memory-profiler==0.41', 'mistune==0.7.1', 'mkl-service==1.0', 'mkl==11.1.4', 'mpmath==0.19', 'nbconvert==4.2.0', 'nbformat==4.0.1', 'ndg-httpsclient==0.4.2', 'nose==1.3.7', 'notebook==4.2.1', 'numpy==1.10.4', 'pandas==0.18.0', 'path.py==8.1.1', 'pep8==1.7.0', 'pexpect==2.4', 'pickleshare==0.5', 'pil-remove==1.0.0', 'pillow==3.2.0', 'pip==8.1.2', 'plotly==1.9.10', 'ply==3.8', 'psutil==3.3.0', 'pyaudio==0.2.4', 'pycparser==2.14', 'pycrypto==2.6.1', 'pyface==5.1.0', 'pyflakes==1.1.0', 'pyglet==1.1.4', 'pygments==2.1.3', 'pyopenssl==16.1.0', 'pyparsing==2.0.3', 'pyreadline==2.1', 'python-dateutil==2.5.2', 'pytz==2016.3', 'pywin32==220', 'pyzmq==15.2.0', 'qtconsole==4.2.1', 'scipy==0.17.1', 'setuptools==23.1.0', 'simplegeneric==0.8.1', 'singledispatch==3.4.0.3', 'six==1.10.0', 'ssl-match-hostname==3.4.0.2', 'sympy==1.0', 'tornado==4.3', 'traitlets==4.2.1', 'traits-enaml==0.2.1', 'traits==4.5.0', 'traitsui==5.1.0', 'urllib3==1.16', 'wxpython==3.0.2.0', 'xlwt==1.1.2'

>>> ssl.OPENSSL_VERSION
'OpenSSL 1.0.2g  1 Mar 2016'

Second edit, following Steffen's comment

import requests
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS += ':RC4-SHA' 
with requests.Session() as c:

    url = "https://81.137.196.234/~sicom/mgrng/login.php"
    page = c.get(url, verify=False)
    print page.content

And the traceback is:

SSLError                                  Traceback (most recent call last)
c:\users\indy_\appdata\local\temp\tmpkuleyd.py in <module>()
      6 
      7     url = "https://81.137.196.234/~sicom/mgrng/login.php"
----> 8     page = c.get(url, verify=False)
      9     print page.content
     10 

C:\Users\indy_\AppData\Local\Enthought\Canopy32\User\lib\site-packages\requests\sessions.pyc in get(self, url, **kwargs)
    485 
    486         kwargs.setdefault('allow_redirects', True)
--> 487         return self.request('GET', url, **kwargs)
    488 
    489     def options(self, url, **kwargs):

C:\Users\indy_\AppData\Local\Enthought\Canopy32\User\lib\site-packages\requests\sessions.pyc in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    473         }
    474         send_kwargs.update(settings)
--> 475         resp = self.send(prep, **send_kwargs)
    476 
    477         return resp

C:\Users\indy_\AppData\Local\Enthought\Canopy32\User\lib\site-packages\requests\sessions.pyc in send(self, request, **kwargs)
    583 
    584         # Send the request
--> 585         r = adapter.send(request, **kwargs)
    586 
    587         # Total elapsed time of the request (approximately)

C:\Users\indy_\AppData\Local\Enthought\Canopy32\User\lib\site-packages\requests\adapters.pyc in send(self, request, stream, timeout, verify, cert, proxies)
    475         except (_SSLError, _HTTPError) as e:
    476             if isinstance(e, _SSLError):
--> 477                 raise SSLError(e, request=request)
    478             elif isinstance(e, ReadTimeoutError):
    479                 raise ReadTimeout(e, request=request)

SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')],)",) 

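I don't need to go through and actually verify the certificate - I know it is safe. I really just want to bypass the certificate check, which is what I thought the verify=False parameter in requests.get() was for, but apparently it is being ignored.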
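It may help to separate two things that happen during a TLS handshake: agreeing on a protocol version and cipher, and validating the certificate. A handshake failure alert is generally sent when the two sides cannot agree on the former, so it can still occur with verification switched off. The sketch below, using only the standard library, builds a context that skips certificate checks and then reports either what was negotiated or the handshake error. The names make_unverified_context and probe_handshake are hypothetical, and the result reflects the ssl module's OpenSSL, which may not be the one requests ends up using.

```python
import socket
import ssl


def make_unverified_context():
    """Build a client context that skips certificate validation only."""
    ctx = ssl.create_default_context()
    # check_hostname has to be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def probe_handshake(host, port=443, timeout=10):
    """Attempt a handshake without certificate checks.

    Returns (True, "<protocol> / <cipher>") on success, or
    (False, "<error text>") if the handshake itself is rejected.
    """
    ctx = make_unverified_context()
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        tls = ctx.wrap_socket(sock, server_hostname=host)
    except ssl.SSLError as exc:
        sock.close()
        return False, str(exc)
    try:
        cipher_name = tls.cipher()[0]
        return True, "{0} / {1}".format(tls.version(), cipher_name)
    finally:
        tls.close()


# Example call (needs network access to the host in question):
# print(probe_handshake("81.137.196.234"))
```

If the probe fails on Windows but succeeds on the Mac with verification disabled in both cases, that would point at protocol or cipher negotiation rather than at the certificate.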

[Screenshot: what Chrome says]

Solved

I completely uninstalled Python and all the extra modules.

I then reinstalled Python 2.7.11. Now it spits out a lot of

C:\Users\indy_\AppData\Local\Enthought\Canopy32\User\lib\site-packages\requests\packages\urllib3\connectionpool.py:821: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html

InsecureRequestWarning)

warnings, but thankfully it works.
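If the repeated InsecureRequestWarning output becomes a nuisance, one option is to filter it with the standard-library warnings module. This is a minimal sketch that matches on the start of the message text shown above; silence_insecure_request_warning is a hypothetical helper name. urllib3 also provides a disable_warnings() function for the same purpose. Either way, this only hides the message; the requests remain unverified.

```python
import warnings


def silence_insecure_request_warning():
    """Ignore warnings whose message starts with the urllib3 wording.

    The message argument is a regular expression matched against the
    beginning of the warning text, so other warnings are unaffected.
    """
    warnings.filterwarnings("ignore", message="Unverified HTTPS request")


# Call once, before making any requests with verify=False.
silence_insecure_request_warning()
```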

0 answers:

No answers