I am trying to access a website like this:
from urllib.request import urlopen
from bs4 import BeautifulSoup
from get_info import get_info
from get_next import get_next
html = urlopen(url)
This is what I get:
Traceback (most recent call last):
  File "D:\scraping\loverepublic_scraper\get_pages.py", line 27, in <module>
    info.append(get_info(link))
  File "D:\scraping\loverepublic_scraper\get_info.py", line 8, in get_info
    html = urlopen(url)
  File "C:\Users\taras\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\taras\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 526, in open
    response = self._open(req, data)
  File "C:\Users\taras\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 544, in _open
    '_open', req)
  File "C:\Users\taras\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 504, in _call_chain
    result = func(*args)
  File "C:\Users\taras\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Users\taras\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833)>
In another SO question I saw a possible solution. Import the ssl package:
import ssl
create a context with hostname checking and certificate verification turned off:
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
and then pass that object to the urlopen function:
html = urlopen(url, context=ctx)
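For reference, here are the steps above combined into one self-contained sketch. The names make_unverified_context and fetch_html are hypothetical helpers introduced only for illustration; they are not part of my scraper or of any library.

```python
import ssl
from urllib.request import urlopen


def make_unverified_context():
    """Build an SSL context that skips certificate and hostname checks."""
    ctx = ssl.create_default_context()
    # check_hostname has to be switched off before verify_mode is set to
    # CERT_NONE; doing it the other way round raises ValueError.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch_html(url, ctx=None):
    """Hypothetical helper: open url with the unverified context, return the body as bytes."""
    if ctx is None:
        ctx = make_unverified_context()
    # The context only applies to the call it is passed to.
    with urlopen(url, context=ctx) as response:
        return response.read()
```

One thing worth noting: the context only affects the urlopen calls that actually receive it. In the traceback above, the failing call is the plain urlopen(url) at line 8 of get_info.py, so that is presumably the call that would need context=ctx.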
However, I still get the same error. How can I get around it?