Question

My code scrapes the top movies from the IMDB website using only the BeautifulSoup parser.

The code is as follows:

from bs4 import BeautifulSoup
import requests
import sys
url = "http://www.imdb.com/chart"
response = requests.get(url)
soup = BeautifulSoup(response.text)
tr = soup.findChildren("tr")
tr = iter(tr)
next(tr)
for movie in tr:
    title = movie.find('td',{'class':'titleColumn'}).find('a').contents[0]
    year = movie.find('td',{'class':'titleColumn'}).find('span',{'class':'secondaryInfo'}).contents[0]
    rating = movie.find('td',{'class':'ratingColumn imdbRating'}).find('strong').contents[0]
    row = title + "-" + year + " " + " " + rating
    print(row)

It fails with the following error:

Traceback (most recent call last):
  File "imdb_scrap.py", line 12, in <module>
    year = movie.find('td',{'class':'titleColumn'}).find('span',{'class':'secondaryInfo'}).contents[0]
AttributeError: 'NoneType' object has no attribute 'contents'

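The first time I ran the code it printed only part of the expected output before stopping with this error; on later runs it did not run at all.

Sample output: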

Stalker-(1979)  8.1
Paper Moon-(1973)  8.1
The Maltese Falcon-(1941)  8.1
The Truman Show-(1998)  8.1
Hachi: A Dog's Tale-(2009)  8.1
Le notti di Cabiria-(1957)  8.1
The Princess Bride-(1987)  8.1
Kaze no tani no Naushika-(1984)  8.1
Munna Bhai M.B.B.S.-(2003)  8.1
Before Sunrise-(1995)  8.1
Harry Potter and the Deathly Hallows: Part 2-(2011)  8.0
The Grapes of Wrath-(1940)  8.0
Prisoners-(2013)  8.0
Rocky-(1976)  8.0
Star Wars: Episode VII - The Force Awakens-(2015)  8.0
Touch of Evil-(1958)  8.0
Sholay-(1975)  8.0
Catch Me If You Can-(2002)  8.0
Gandhi-(1982)  8.0

Any help or suggestions would be greatly appreciated.

Answer 1

Looking at the code, the error is in

.find('span',{'class':'secondaryInfo'}).contents[0]

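This may not be the only problem, but it is the first one, and it is the one the error message reports. I also know the first half of the year expression is correct, because it is identical to the one used for title. The failure therefore comes from the second find: for at least one row there is no span with class secondaryInfo, so find returns None instead of a tag, and None has no contents to index. If you print the result of that find on its own for each row,

print(movie.find('td',{'class':'titleColumn'}).find('span',{'class':'secondaryInfo'}))

I guarantee you will see None for the failing row. That None is the NoneType object that the error message says has no attribute: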

AttributeError: 'NoneType' object has no attribute 'contents'
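
One way to guard against this is to check each find result before using it. The following is only a minimal sketch under stated assumptions: SAMPLE_HTML is hypothetical markup that imitates the table structure the question's code expects (it is not IMDB's actual page), and the parse_rows helper and the "(year unknown)" placeholder are names I made up for illustration. It parses an inline string so it can be run without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup imitating the structure the question's code
# expects; this is an assumption, not IMDB's real page.
SAMPLE_HTML = """
<table>
  <tr><th>Rank &amp; Title</th><th>IMDb Rating</th></tr>
  <tr>
    <td class="titleColumn"><a href="#">Gandhi</a> <span class="secondaryInfo">(1982)</span></td>
    <td class="ratingColumn imdbRating"><strong>8.0</strong></td>
  </tr>
  <tr>
    <td class="titleColumn"><a href="#">Untitled Row</a></td>
    <td class="ratingColumn imdbRating"><strong>7.9</strong></td>
  </tr>
  <tr><td class="other">not a movie row</td></tr>
</table>
"""


def parse_rows(html):
    """Return 'title-year  rating' strings, skipping rows that are not movies."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        title_cell = tr.find("td", {"class": "titleColumn"})
        rating_cell = tr.find("td", {"class": "ratingColumn imdbRating"})
        # find() returns None when nothing matches, so check before chaining.
        if title_cell is None or rating_cell is None:
            continue
        link = title_cell.find("a")
        strong = rating_cell.find("strong")
        if link is None or strong is None:
            continue
        year_span = title_cell.find("span", {"class": "secondaryInfo"})
        title = link.get_text(strip=True)
        # Fall back to a placeholder instead of raising AttributeError.
        year = year_span.get_text(strip=True) if year_span is not None else "(year unknown)"
        rating = strong.get_text(strip=True)
        rows.append(f"{title}-{year}  {rating}")
    return rows


rows = parse_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

With the sample markup, the header row and the non-movie row are skipped, and the row without a year span is printed with the placeholder rather than crashing. Whether to skip such rows or print a placeholder is a design choice; the point is only that every find result is tested for None before .contents or get_text is called on it.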
