Question

I'm trying to get the song titles from the Billboard Hot 100. The image shows the page's HTML.

I wrote this code:

from bs4 import BeautifulSoup
import urllib.request

url= 'http://www.billboard.com/charts/year-end/2015/hot-100-songs'
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page.read(), "html.parser")
songtitle = soup.find("div", {"class": "row-title"}).h2.contents
print(songtitle)

It retrieves the first title, "UPTOWN FUNK!". When I use find_all instead, it gives me an error:

line 6, in <module>
songtitle = soup.find_all("div", {"class": "row-title"}).h2.contents
AttributeError: 'ResultSet' object has no attribute 'h2'

Why does it give me an error instead of all the titles? The full HTML can be viewed on this page by pressing Ctrl+Shift+J in Chrome: http://www.billboard.com/charts/year-end/2015/hot-100-songs

Answer 1

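.find_all() returns a ResultSet object, which is essentially a list of Tag instances. It has no find() method and does not support tag navigation such as .h2. You need to loop over the results of find_all() and access h2 (or call find()) on each tag: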

for item in soup.find_all("div", {"class": "row-title"}):
    songtitle = item.h2.contents
    print(songtitle)
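To see the difference without hitting the network, here is a minimal sketch that runs against an inline snippet. The SAMPLE_HTML markup is an assumption: it only reuses the "row-title" class and the h2 tag from the question, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a stand-in for the chart page, reusing only the
# "row-title" class and <h2> tag mentioned in the question.
SAMPLE_HTML = """
<div class="row-title"><h2>UPTOWN FUNK!</h2></div>
<div class="row-title"><h2>THINKING OUT LOUD</h2></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
rows = soup.find_all("div", {"class": "row-title"})

# A ResultSet supports len() and indexing like a list...
row_count = len(rows)

# ...but tag navigation such as .h2 only works on its individual Tag elements.
try:
    rows.h2
    raised = False
except AttributeError:
    raised = True

# Navigate to <h2> on each Tag, one at a time.
titles = [row.h2.get_text(strip=True) for row in rows]
print(row_count, raised, titles)
```

Note that get_text() gives you a plain string, whereas .contents gives you a list of the tag's children.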

Alternatively, use a CSS selector:

for title in soup.select("div.row-title h2"):
    print(title.get_text())
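The selector API has the same "one versus many" split as find() and find_all(). A small sketch, again using made-up inline markup (SAMPLE_HTML is an assumption, not the real page):

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the class name from the question.
SAMPLE_HTML = """
<div class="row-title"><h2> UPTOWN FUNK! </h2></div>
<div class="row-title"><h2> THINKING OUT LOUD </h2></div>
"""
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# select() returns every match, analogous to find_all().
all_titles = [h2.get_text(strip=True) for h2 in soup.select("div.row-title h2")]

# select_one() returns the first match or None, analogous to find().
first = soup.select_one("div.row-title h2")
first_title = first.get_text(strip=True) if first is not None else None

print(all_titles)
print(first_title)
```

Passing strip=True to get_text() trims the surrounding whitespace that often appears in scraped markup.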

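By the way, this problem is covered in the documentation:

AttributeError: 'ResultSet' object has no attribute 'foo' - This usually happens because you expected find_all() to return a single tag or string. But find_all() returns a list of tags and strings: a ResultSet object. You need to iterate over the list and look at the .foo of each one. Or, if you really only want one result, you need to use find() instead of find_all().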

Answer 2

find_all always returns a list (a ResultSet), so you can use list operations such as indexing on it.

For example:

# Index into the result list to pick one div, then read its <h2> text
songtitle = soup.find_all("div", {"class": "row-title"})[0]
print(songtitle.h2.get_text())
songtitle = soup.find_all("div", {"class": "row-title"})[1]
print(songtitle.h2.get_text())

Output:

UPTOWN FUNK!
THINKING OUT LOUD

# Or loop over every matching div
for item in soup.find_all("div", {"class": "row-title"}):
    songtitle = item.h2.get_text()
    print(songtitle)
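One caveat with looping: if a matching div has no h2 child, item.h2 is None and calling a method on it raises AttributeError. A minimal sketch of guarding against that, using hypothetical markup where the middle row is deliberately missing its heading:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second row has no <h2> on purpose.
SAMPLE_HTML = """
<div class="row-title"><h2>UPTOWN FUNK!</h2></div>
<div class="row-title"></div>
<div class="row-title"><h2>THINKING OUT LOUD</h2></div>
"""
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

titles = []
for item in soup.find_all("div", {"class": "row-title"}):
    heading = item.find("h2")  # None when the row has no <h2>
    if heading is None:
        continue  # skip rows without a heading instead of crashing
    titles.append(heading.get_text(strip=True))

print(titles)
```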
