Question

After installing BeautifulSoup, I get this warning every time I run my Python script in cmd.

D:\Application\python\lib\site-packages\beautifulsoup4-4.4.1-py3.4.egg\bs4\__init__.py:166:
UserWarning: No parser was explicitly specified, so I'm using the best
available HTML parser for this system ("html.parser"). This usually isn't a
problem, but if you run this code on another system, or in a different
virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

 BeautifulSoup([your markup])

to this:

 BeautifulSoup([your markup], "html.parser")

I don't understand why it appears or how to fix it.

Answer 1

The warning message itself states the solution. Code like the following does not specify which XML/HTML/etc. parser to use.

BeautifulSoup( ... )

To get rid of the warning, specify the parser you want to use, for example:

BeautifulSoup( ..., "html.parser" )

If you prefer, you can also install a third-party parser.
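
As a quick check of the fix, here is a minimal sketch that builds a soup with the parser named explicitly and records any warnings raised while doing so. The sample markup and the variable names are made up for illustration; only `BeautifulSoup` and the `"html.parser"` argument come from the warning text.

```python
import warnings

from bs4 import BeautifulSoup

# Sample markup, invented for this illustration.
markup = "<p>Hello, <b>world</b></p>"

# Record every warning raised while the soup is being built.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    soup = BeautifulSoup(markup, "html.parser")  # parser named explicitly

# Keep only the "no parser specified" warning from the question, if any.
guessed = [w for w in caught
           if "No parser was explicitly specified" in str(w.message)]

print(soup.b.get_text())
print(len(guessed))
```

With the parser named, the `guessed` list should stay empty; dropping the second argument is what triggers the warning shown in the question.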

Answer 2

The documentation recommends installing and using lxml for speed.

BeautifulSoup(html, "lxml")

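If you are using a version of Python 2 earlier than 2.7.3, or a version of Python 3 earlier than 3.2.2, it is essential to install lxml or html5lib, because Python's built-in HTML parser is not very good in those older versions.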

Installing the lxml parser

On Ubuntu (Debian); for Python 3 the package is named python3-lxml
```
apt-get install python-lxml 
```
On Fedora (RHEL-based)
```
dnf install python-lxml
```
Using pip
```
pip install lxml
```
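
Once a parser is installed, a script can check whether it is importable before naming it. The sketch below is only an illustration: `pick_parser` is a hypothetical helper, not part of bs4, and it relies on the fact that the parser names "lxml" and "html5lib" match the names of the modules that provide them.

```python
from importlib.util import find_spec


def pick_parser(preferred=("lxml", "html5lib")):
    """Return the first installed parser name, else the built-in one.

    Hypothetical helper for illustration; not part of bs4.
    """
    for name in preferred:
        # find_spec returns None when a top-level module is not installed.
        if find_spec(name) is not None:
            return name
    return "html.parser"


print(pick_parser())
```

The result can then be passed as the second argument, as in `BeautifulSoup(html, pick_parser())`. Keep in mind that different parsers can build different trees from invalid markup, so hard-coding one parser remains the more reproducible choice.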

Answer 3

To use html5lib as the HTML parser, install it first:

pip install html5lib

Then pass html5lib to the BeautifulSoup constructor:

import bs4

# req1 is assumed to be a response object fetched earlier (e.g. with requests).
htmlDoc = bs4.BeautifulSoup(req1.text, 'html5lib')
print(htmlDoc)

Answer 4

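I don't think the previous posts answer the question.

Yes, as everyone has said, you can remove the warning by specifying a parser.
As the documentation points out, doing so is best practice for both performance and consistency.

But there are cases where you just want to silence the warning, hence this post.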

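Since BeautifulSoup 4 rev 460, the warning is no longer shown in interactive (REPL) mode.
There are more general answers on controlling Python warnings under "How to disable python warnings" (TL;DR: PYTHONWARNINGS=ignore or -Wignore).

Silence the warning explicitly by adding this to your code (bs4 ≥ rev 569):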

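import warnings

import bs4
from bs4 import GuessedAtParserWarning

# Ignore only the "no parser specified" warning; other warnings still show.
warnings.filterwarnings('ignore', category=GuessedAtParserWarning)

Or pass a builder explicitly, so that bs4 does not have to guess a parser:

bs4.BeautifulSoup(
  your_markup,
  builder=bs4.builder_registry.lookup(*bs4.BeautifulSoup.DEFAULT_BUILDER_FEATURES)
)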
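
The `warnings.filterwarnings` call above applies for the rest of the process. If the warning should be silenced around a single call only, the standard library's `warnings.catch_warnings()` context manager can scope the filter. The sketch below uses stand-ins so that it runs without bs4: `GuessedAtParserWarningStandIn` and `noisy_parse` are hypothetical names that only imitate a call emitting the warning.

```python
import warnings


class GuessedAtParserWarningStandIn(UserWarning):
    """Hypothetical stand-in for bs4's GuessedAtParserWarning."""


def noisy_parse(markup):
    """Imitates a call that warns because no parser was named."""
    warnings.warn("No parser was explicitly specified",
                  GuessedAtParserWarningStandIn)
    return markup.strip()


def quiet_parse(markup):
    """Runs noisy_parse with the warning ignored for this call only."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore",
                              category=GuessedAtParserWarningStandIn)
        return noisy_parse(markup)
    # On leaving the with-block the previous filters are restored.


print(quiet_parse("  <p>hi</p>  "))
```

With real code, the stand-in class would be replaced by the warning class imported from bs4 and `noisy_parse` by the actual `BeautifulSoup(...)` call.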

How to get rid of BeautifulSoup user warning?
