Question

I am trying to use BeautifulSoup 4. It installed successfully, but I keep getting errors that I cannot fix, all caused by the line "soup = BeautifulSoup(html)".

When I use the following code:

from bs4 import BeautifulSoup  
soup = BeautifulSoup(html)

I get this error:

//anaconda/lib/python3.5/site-packages/bs4/__init__.py:166: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

 BeautifulSoup([your markup])

to this:

  BeautifulSoup([your markup], "lxml")

  markup_type=markup_type))
Traceback (most recent call last):

   File "<ipython-input-13-d4b16f497b1d>", line 1, in <module>
runfile('/Users/beckswu/Desktop/coursera/using python access web data/class 2.py', wdir='/Users/beckswu/Desktop/coursera/using python access web data')

   File "//anaconda/lib/python3.5/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 699, in runfile
execfile(filename, namespace)

   File "//anaconda/lib/python3.5/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 88, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

   File "/Users/beckswu/Desktop/coursera/using python access web data/class 2.py", line 37, in <module>
soup = BeautifulSoup(html)

   File "//anaconda/lib/python3.5/site-packages/bs4/__init__.py", line 212, in __init__
markup, from_encoding, exclude_encodings=exclude_encodings)):

   File "//anaconda/lib/python3.5/site-packages/bs4/builder/_lxml.py", line 108, in prepare_markup
markup, try_encodings, is_html, exclude_encodings)

TypeError: __init__() takes from 2 to 4 positional arguments but 5 were given

Then I changed the code to:

from bs4 import BeautifulSoup  
soup = BeautifulSoup(html,"lxml")
markup_type=markup_type))

It also shows an error:

    markup_type=markup_type))
                       ^
SyntaxError: invalid syntax

How can I fix this? I would appreciate any help.

Answer 1

I believe there is an error in your code:

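from bs4 import BeautifulSoup
# if you decide to use Python's built-in html.parser as the parser
# (html is the markup string from your own code)
soup = BeautifulSoup(html, features="html.parser")

# The third parameter is builder, and it defaults to None, so you don't have to pass it.
# markup_type is not a constructor parameter; that line is part of the printed warning and should not be copied into your code.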

If you do not have lxml, you can install it by running:

pip install lxml

Then you import it and use it like this:

from bs4 import BeautifulSoup
import lxml  # optional: BeautifulSoup finds lxml by name once it is installed
soup = BeautifulSoup(html, "lxml")
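If it is not certain that lxml is installed wherever the script runs, the parser name can be chosen explicitly before calling BeautifulSoup. The following is only a sketch: `pick_parser` is a hypothetical helper, not part of bs4, and it assumes that falling back to Python's built-in `html.parser` is acceptable.

```python
import importlib.util


def pick_parser(preferred=("lxml", "html5lib")):
    """Return the first installed parser name, else the built-in one.

    pick_parser is a hypothetical helper for illustration; it is not
    part of Beautiful Soup.
    """
    for name in preferred:
        # find_spec returns None when the top-level module is not installed
        if importlib.util.find_spec(name) is not None:
            return name
    # html.parser ships with Python, so it is always available
    return "html.parser"


parser_name = pick_parser()
print(parser_name)
# Assumed usage: soup = BeautifulSoup(html, parser_name)
```

Because the result can still differ between machines, pinning a single name such as "html.parser" is the more predictable option when identical behavior everywhere matters.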

The parameters of the BeautifulSoup constructor are:

markup="", features=None, builder=None, parse_only=None, from_encoding=None, exclude_encodings=None, and **kwargs.
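As a small illustration of the `features` parameter, the sketch below parses an inline string with the built-in `html.parser` and records any warnings raised while the object is constructed. The sample markup and variable names are made up for the example, and it assumes bs4 is installed.

```python
import warnings

from bs4 import BeautifulSoup

# Made-up sample markup, used only for this illustration
html = "<html><body><p class='title'>Hello</p></body></html>"

with warnings.catch_warnings(record=True) as caught:
    # Record every warning instead of printing it
    warnings.simplefilter("always")
    # Naming the parser explicitly via the features argument
    soup = BeautifulSoup(html, features="html.parser")

# Keep only UserWarning-type entries, the category shown in the question
parser_warnings = [w for w in caught if issubclass(w.category, UserWarning)]

print(len(parser_warnings))
print(soup.p.get_text())
```

With the parser named explicitly, the "No parser was explicitly specified" warning should not show up in the recorded list.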

Answer 2

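Instead of html, you need to pass the text of the HTML page, as shown below:

from bs4 import BeautifulSoup
import requests  # needed for requests.get

request = requests.get("http://www.flipkart.com/search").text
# naming the parser avoids the "No parser was explicitly specified" warning
soup = BeautifulSoup(request, "html.parser")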

Hope this helps :)

Python 3.5 Beautiful Soup 4 error: UserWarning: No parser was explicitly specified
