I'm trying to use BeautifulSoup4 with Python 3.5, but whenever I run code that fetches something from the Internet, I get this error:
- s4\__init__.py", line 198, in __init__
% ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
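The same error is described in another question on this site, "bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?". I tried everything suggested there and still get the error.
I ran all of these: pip install requests, pip install lxml, pip install beautifulsoup4.
I also downloaded BeautifulSoup4 manually from https://www.crummy.com/software/BeautifulSoup/bs4/download/4.6/ and installed it with setup.py install.
Everything is up to date and the installs complete without problems, but the error still appears. Please help.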
Answer 0 (score: 7)
If you are using html5lib as the underlying parser:
soup = BeautifulSoup(html, "html5lib")
# ^HERE^
then you need to install the html5lib module in your Python environment:
pip install html5lib
Documentation reference: Installing a parser.
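If pip reports a successful install but the error persists, one possible cause (an assumption here, the question does not confirm it) is that pip installed the module into a different Python than the one running the script. A minimal sketch for checking this, using only the standard library; parser_report is a hypothetical helper name, not part of bs4:

```python
import importlib.util
import sys


def parser_report(modules=("html5lib", "lxml")):
    """Return {module_name: bool} telling whether each module is importable here."""
    # find_spec returns None for a top-level module that is not installed
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # The interpreter that is actually running this script
    print("Interpreter:", sys.executable)
    for name, available in parser_report().items():
        print(name, "importable" if available else "NOT importable")
```

If a module shows up as not importable, running python -m pip install html5lib with that same interpreter installs it into the matching environment.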
Answer 1 (score: 0)
For anyone who still gets the same error even with html5lib installed: replace "html5lib" with "html.parser", as suggested at https://github.com/coursera-dl/edx-dl/issues/434
Worked for me :)
Answer 2 (score: 0)
For me, html.parser worked:
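from bs4 import BeautifulSoup
import urllib.request

# Fetch the page, then parse it with Python's built-in html.parser
response = urllib.request.urlopen('http://php.net/')
html = response.read()
soup = BeautifulSoup(html, "html.parser")
# Extract the visible text with surrounding whitespace stripped
text = soup.get_text(strip=True)
print(text)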
Answer 3 (score: 0)
Use "html.parser" instead of "html5lib". That will work.
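The suggestions in this thread can be combined: prefer html5lib or lxml when they are installed, otherwise fall back to html.parser, which ships with Python. A rough sketch under that assumption; pick_parser and PARSER_MODULES are hypothetical names for illustration, not part of bs4:

```python
import importlib.util

# Feature name passed to BeautifulSoup -> module that must be importable for it
PARSER_MODULES = {
    "html5lib": "html5lib",
    "lxml": "lxml",
    "html.parser": "html.parser",
}


def pick_parser(preferred=("html5lib", "lxml", "html.parser")):
    """Return the first preferred parser whose module is importable."""
    for name in preferred:
        module = PARSER_MODULES.get(name)
        if module and importlib.util.find_spec(module) is not None:
            return name
    # html.parser is in the standard library, so it is always a safe fallback
    return "html.parser"
```

The result can then be passed as the second argument, for example BeautifulSoup(html, pick_parser()). Note that the parsers can build slightly different trees for malformed HTML, so output may differ depending on which one is picked.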