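I am trying to get the links from an IMDb web page. The links are inside a table, but I get the error below and I don't know how to extract the links. I'm a beginner, please help.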
from bs4 import BeautifulSoup
import urllib2
var_file = urllib2.urlopen("http://www.imdb.com/chart/top")
var_html = var_file.read()
var_file.close()
soup = BeautifulSoup(var_html)
for item in soup.find_all(tbody={'class': 'lister-list'}):
    for link in item.find_all('a'):
        print(link.get('href'))
I get this error:
C:\Python27\lib\site-packages\bs4\__init__.py:166: UserWarning: No parser was
explicitly specified, so I'm using the best available HTML parser for this system
("lxml"). This usually isn't a problem, but if you run this code on another
system, or in a different virtual environment, it may use a different parser and
behave differently.
To get rid of this warning, change this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "lxml")
markup_type=markup_type))
Answer 0 (score: 1)
This is only a warning telling you that you did not choose a parser. It is not an error.
Instead of
soup = BeautifulSoup(var_html)
try:
soup = BeautifulSoup(var_html, "lxml")
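For reference, here is a minimal sketch of naming the parser explicitly. The sample_html string is a made-up stand-in for the downloaded page so the snippet runs without network access. It uses "html.parser" (bundled with Python) only to avoid an extra dependency; "lxml" is passed the same way if it is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page contents (not the real IMDb markup).
sample_html = "<html><body><p>Hello</p></body></html>"

# The second argument names the parser, which is what the warning asks for.
# "html.parser" is Python's built-in parser; "lxml" can be used here instead.
soup = BeautifulSoup(sample_html, "html.parser")

text = soup.p.get_text()
print(text)
```

Either parser name should silence the warning; the choice mainly affects speed and how broken markup gets repaired.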
Answer 1 (score: 0)
Use the class_ keyword to filter by CSS class:
soup.find_all(class_='lister-list')
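Some background on why this helps: find_all treats keyword arguments as attribute filters, so tbody={...} looks for an attribute named tbody rather than a tbody tag. A rough sketch of the full loop is below. The HTML is a simplified, made-up stand-in for the chart table (the hrefs are placeholders), and it assumes the real page still uses a tbody with class lister-list as in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for the IMDb chart table.
sample_html = """
<table>
  <tbody class="lister-list">
    <tr><td><a href="/title/tt0000001/">First film</a></td></tr>
    <tr><td><a href="/title/tt0000002/">Second film</a></td></tr>
  </tbody>
</table>
<a href="/elsewhere/">Link outside the table</a>
"""

soup = BeautifulSoup(sample_html, "html.parser")

links = []
# The tag name goes first; class_ then filters on the CSS class.
for tbody in soup.find_all("tbody", class_="lister-list"):
    # Collect the href of every anchor inside the matched tbody only.
    for link in tbody.find_all("a"):
        links.append(link.get("href"))

print(links)
```

Passing the tag name as well as class_ keeps the match limited to tbody elements, in case some other element on the page carries the same class.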