In a Python 2.7 script, I'm using BeautifulSoup:
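import urllib
from bs4 import BeautifulSoup
# Assumption: fromstring/tostring come from lxml; the original import is not shown.
from lxml.html import fromstring, tostring

url_page_accueil_pluzz = "http://pluzz.francetv.fr/"
f = urllib.urlopen(url_page_accueil_pluzz)  # Python 2 API
page_accueil_pluzz = f.read()
f.close()
# Parse the page, then round-trip it through lxml to pretty-print the markup.
soup = str(BeautifulSoup(page_accueil_pluzz, "html.parser"))
root = fromstring(soup)
new_soup = tostring(root, pretty_print=True).strip()
new_soup = BeautifulSoup(new_soup, "html.parser")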
I can't get rid of this message:
/usr/lib/python2.7/dist-packages/bs4/__init__.py:166: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
To get rid of this warning, change this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "html.parser")
markup_type=markup_type))
I'm on Kubuntu 16.04 (x64)
python-bs4 version 4.4.1-1
Python v.2.7.12
Adding "html.parser" doesn't seem to have any effect on my script. Does anyone have an idea?
$ cat dPluzz.py | grep "BeautifulSoup("
soup = BeautifulSoup(page_video, "html.parser")
soup = BeautifulSoup(self.page_pluzz, "html.parser")
soup = BeautifulSoup(self.feuille_canal, "html.parser")
soup = BeautifulSoup(self.sous_feuille_canal, "html.parser")
soup = str(BeautifulSoup(page_accueil_pluzz, "html.parser"))
new_soup = BeautifulSoup(new_soup, "html.parser")
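Since every call that grep finds already passes "html.parser", the warning may be coming from a call that this grep does not match (another module, an alias, different spacing). One way to locate it, as a minimal sketch: escalate UserWarning to an exception so the traceback shows the call site. The names find_warning_origin, noisy_parse and quiet_parse are hypothetical; noisy_parse only stands in for a BeautifulSoup call made without a parser so that the example runs without bs4 installed.

```python
import traceback
import warnings


def noisy_parse(markup):
    # Hypothetical stand-in for a library call that warns, e.g. a
    # BeautifulSoup(markup) call made without an explicit parser.
    warnings.warn("No parser was explicitly specified", UserWarning)
    return markup


def quiet_parse(markup):
    # Hypothetical stand-in for a call that passes the parser explicitly
    # and therefore emits no warning.
    return markup


def find_warning_origin(func, *args, **kwargs):
    """Run func with UserWarning escalated to an exception.

    Returns the formatted traceback if a UserWarning was raised,
    or None if the call completed without one.
    """
    with warnings.catch_warnings():
        # Inside this block, any UserWarning is raised instead of printed.
        warnings.simplefilter("error", UserWarning)
        try:
            func(*args, **kwargs)
        except UserWarning:
            return traceback.format_exc()
    return None


report = find_warning_origin(noisy_parse, "<p>hi</p>")
print(report)  # the traceback names the function that triggered the warning
```

The same effect should be available for the whole script without editing it, by running it as python -W error::UserWarning dPluzz.py; the script then stops at the first UserWarning and prints the traceback of the offending call.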