Python 3.5 Beautiful Soup 4 error: UserWarning: No parser was explicitly specified

Time: 2016-06-07 03:10:21

Tags: web-scraping beautifulsoup python-3.5

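I am trying to use BeautifulSoup 4. It installed successfully, but I keep getting errors that I cannot fix, all triggered by the line "soup = BeautifulSoup(html)".

When I use the following code: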

from bs4 import BeautifulSoup  
soup = BeautifulSoup(html)

it shows this error:

//anaconda/lib/python3.5/site-packages/bs4/__init__.py:166: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

 BeautifulSoup([your markup])

to this:

  BeautifulSoup([your markup], "lxml")

  markup_type=markup_type))
Traceback (most recent call last):

   File "<ipython-input-13-d4b16f497b1d>", line 1, in <module>
runfile('/Users/beckswu/Desktop/coursera/using python access web data/class 2.py', wdir='/Users/beckswu/Desktop/coursera/using python access web data')

   File "//anaconda/lib/python3.5/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 699, in runfile
execfile(filename, namespace)

   File "//anaconda/lib/python3.5/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 88, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

   File "/Users/beckswu/Desktop/coursera/using python access web data/class 2.py", line 37, in <module>
soup = BeautifulSoup(html)

   File "//anaconda/lib/python3.5/site-packages/bs4/__init__.py", line 212, in __init__
markup, from_encoding, exclude_encodings=exclude_encodings)):

   File "//anaconda/lib/python3.5/site-packages/bs4/builder/_lxml.py", line 108, in prepare_markup
markup, try_encodings, is_html, exclude_encodings)

TypeError: __init__() takes from 2 to 4 positional arguments but 5 were given

Then I changed the code to

from bs4 import BeautifulSoup  
soup = BeautifulSoup(html,"lxml")
markup_type=markup_type))

It also shows an error:

    markup_type=markup_type))
                       ^
SyntaxError: invalid syntax

How can I fix this? I appreciate any help.

2 Answers:

Answer 0: (score: 0)

I believe there is an error in your code:

  
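from bs4 import BeautifulSoup
# if you decide to use Python's built-in html.parser
# (html is the variable holding your markup, as in the question)
soup = BeautifulSoup(html, features="html.parser")

The third parameter is builder, and it defaults to None, so you don't have to pass it. There is no markup_type parameter: the line markup_type=markup_type)) appears to have been copied from the warning output and should be removed from your code.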
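
The SyntaxError from the question can be reproduced on its own, without Beautiful Soup installed. The following is a minimal sketch, assuming the stray line was pasted into the script from the warning text; the helper name compiles is hypothetical and only used for illustration:

```python
def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        # compile() only checks syntax; it does not resolve names or run code.
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# The line from the question: two closing parentheses with no opening ones.
stray_line = "markup_type=markup_type))"

# The constructor call on its own, without the stray line.
fixed_call = 'soup = BeautifulSoup(html, "lxml")'

stray_ok = compiles(stray_line)
fixed_ok = compiles(fixed_call)
print(stray_ok, fixed_ok)
```

The constructor call is valid by itself, so deleting the stray line should be enough to get past the SyntaxError.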

If you don't have lxml, you can install it by running:

pip install lxml 

Then you import it and use it like this:

 
from bs4 import BeautifulSoup
import lxml  # optional: bs4 looks the parser up by name; this only confirms lxml is installed
soup = BeautifulSoup(html, "lxml")

The parameters of the BeautifulSoup constructor are:

markup="", features=None, builder=None, parse_only=None, from_encoding=None, exclude_encodings=None, and **kwargs.
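
To see the parser argument in action, here is a minimal sketch that names the parser explicitly, as the warning suggests. The markup string and the variable names are made up for illustration, and "html.parser" is used because it ships with Python and needs no extra install:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used here only for illustration.
html = (
    "<html><head><title>Example</title></head>"
    "<body><a href='/a'>first</a><a href='/b'>second</a></body></html>"
)

# Naming the parser means the same one is used on every machine.
soup = BeautifulSoup(html, features="html.parser")

title = soup.title.string
links = [a["href"] for a in soup.find_all("a")]
print(title)
print(links)
```

Passing "lxml" instead of "html.parser" should work the same way once lxml is installed. The two parsers can differ in how they handle malformed markup, which is the reason the warning asks you to pick one.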

Answer 1: (score: -1)

Instead of html, you need to pass the text of the HTML page, as shown below:

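import requests
from bs4 import BeautifulSoup
# .text is the response body decoded to a str
request = requests.get("http://www.flipkart.com/search").text
soup = BeautifulSoup(request, "html.parser")  # parser named explicitly to avoid the warning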

Hope this helps :)