I wrote a simple script by following a video tutorial:
import bs4 as bs
import urllib.request
source = urllib.request.urlopen('https://pythonprogramming.net/parsememcparseface/').read()
soup = bs.BeautifulSoup(source, 'lxml')
print(source)
When I run the program, it fails with this error:
Traceback (most recent call last):
  File "/Users/UntouchedDruid4/Projects/Web_Scrapper/app.py", line 4, in <module>
    source = urllib.request.urlopen('https://pythonprogramming.net/parsememcparseface/').read()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)>
I don't know what this means. Please help.
Answer 0 (score: 0)
Use urllib2 or requests to fetch the page, then scrape it with re.search or BeautifulSoup, whichever you prefer.
import urllib2
from bs4 import BeautifulSoup
import re
read = urllib2.urlopen('https://pythonprogramming.net/parsememcparseface/').read()
Using re.search
f = re.search(r'<title>(.*)</title>', read)
title = f.group(1)
print " Title Of the Site Is : " + title
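The snippet above uses Python 2 syntax (urllib2 and the print statement), while the traceback in the question comes from Python 3.6. In Python 3, urlopen(...).read() returns bytes, so a str pattern cannot be matched against it directly. A minimal sketch of the same title lookup for Python 3 follows. It assumes the page is UTF-8 encoded; the helper name extract_title and the sample document are hypothetical and only there for illustration.

```python
import re


def extract_title(raw_html, encoding="utf-8"):
    """Return the text of the first <title> element, or None if there is none.

    raw_html is the bytes object returned by urlopen(...).read().
    The UTF-8 default is an assumption; undecodable bytes are replaced.
    """
    text = raw_html.decode(encoding, errors="replace")
    # Non-greedy match so only the first <title>...</title> pair is captured.
    match = re.search(r"<title>(.*?)</title>", text, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


# Hypothetical sample document standing in for the downloaded page.
sample = b"<html><head><title>Example Page</title></head><body></body></html>"
print("Title of the site is: " + str(extract_title(sample)))
```

A regular expression is fine for a one-off lookup like this, but it does not handle attributes on the tag or malformed markup, which is where a real parser is the safer choice.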
Using BeautifulSoup
soup = BeautifulSoup(read, 'html.parser')
print soup.title ## Example For Title
This is just an example that extracts the title.
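As for the error in the question: CERTIFICATE_VERIFY_FAILED means Python could not validate the server's certificate against a set of trusted root certificates. The traceback path (/Library/Frameworks/Python.framework) suggests a python.org installer on macOS, which ships its own OpenSSL and usually needs the bundled "Install Certificates.command" script run once. Another option is to make the trust store explicit in code. The sketch below is one possible approach, not the only fix; the helper names make_verified_context and fetch, and the cafile parameter, are hypothetical.

```python
import ssl
import urllib.request


def make_verified_context(cafile=None):
    """Build an SSL context that still verifies server certificates.

    cafile is an optional path to a CA bundle in PEM format (assumption:
    you have one, for example from the third-party certifi package).
    When it is None, the platform's default trust store is loaded.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch(url, context):
    """Download url using the given SSL context and return the raw bytes."""
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()


# Example usage (performs a network request, so it is left commented out):
# ctx = make_verified_context()
# source = fetch('https://pythonprogramming.net/parsememcparseface/', ctx)
```

Passing an unverified context would also silence the error, but it disables the check that protects against impersonated servers, so keeping verification on and fixing the trust store is generally preferable.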