Hi, I'm trying to build a web scraper for the Top Manga section of MyAnimeList (https://myanimelist.net/topmanga.php) using BeautifulSoup and requests. My problem is that I get this error:
Traceback (most recent call last):
  File "C:/Kaan/Proje/Python/Programlar/Manga/manga2.py", line 119, in <module>
    mangainfo(new_url)
  File "C:/Kaan/Proje/Python/Programlar/Manga/manga2.py", line 29, in mangainfo
    names.append(name.text)
AttributeError: 'NoneType' object has no attribute 'text'
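I understand what the error means. What I don't understand is why it occurs at random points. For example, say I hit the error at manga number 20. If I start the program again, the code runs fine up to manga number 200 and then raises the error. While writing this post I ran the program once more, and the error occurred at manga 140.
How can I fix this? Is it caused by my code, or by the website?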
import requests
from bs4 import Tag, BeautifulSoup
from time import time
url = "https://myanimelist.net/topmanga.php"
def mangainfo(url):
    manga_id, scores, manga_genre, names, authors = list(), list(), list(), list(), list()
    r = requests.get(url)
    c = r.content
    soup2 = BeautifulSoup(c, "html.parser")
    # Manga Links
    manga_links = soup2.find_all("a", class_="hoverinfo_trigger fs14 fw-b")
    count = 0
    for link in manga_links:
        start = time()
        r = requests.get(link["href"])
        c = r.content
        soup = BeautifulSoup(c, "html.parser")
        # Names
        name = soup.find("h1", class_="h1")
        names.append(name.text)
        # Scores
        score = soup.find("div", class_="fl-l score")
        scores.append(str(score.text.strip()))
        # Manga ID
        for x in link["href"].split("/"):
            if x.isdigit():
                manga_id.append(int(x))
                break
        # Manga Genres
        genre = soup.find("span", text="Genres:")
        manga_genre.append([x.text for x in genre.next_siblings if isinstance(x, Tag)])
        # Authors
        author = soup.find("span", text="Authors:")
        authors.append([x.text for x in author.next_siblings if isinstance(x, Tag)])
        stop = time()
        count += 1
        print("{} - Time: {:.2f} Link: {} ==> OK".format(count, stop - start, link["href"]))
for i in range(0, 46651, 50):  # 46651 limit
    new_url = url + "?limit=" + str(i)
    # "Sayfa" is Turkish for "Page"
    print("Sayfa: {} ===> {}".format(str(i), new_url))
    start_time = time()
    mangainfo(new_url)
    stop_time = time()
    print("MangaInfo Time: {:.2f}".format(stop_time - start_time))
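A possible explanation, offered as an assumption rather than a confirmed diagnosis: because the failure lands on a different manga each run, the site may occasionally answer with a page that is not the normal manga page (for example an error or rate-limit response), so `soup.find("h1", class_="h1")` returns `None`. The sketch below shows one way to check the status code, retry with a growing pause, and guard against missing tags. The helper names `fetch_with_retry` and `text_or_none`, and the `attempts` and `delay` defaults, are hypothetical and not part of the original script.

```python
import time


def fetch_with_retry(get, url, attempts=3, delay=2.0, sleep=time.sleep):
    """Call get(url) until the response has status code 200.

    `get` is any callable returning an object with a `status_code`
    attribute (requests.get fits this shape). Returns the successful
    response, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        response = get(url)
        if response.status_code == 200:
            return response
        # Wait longer after each failure, but not after the last attempt.
        if attempt < attempts:
            sleep(delay * attempt)
    return None


def text_or_none(tag):
    """Return the stripped text of a tag, or None if the tag was not found."""
    if tag is None:
        return None
    return tag.text.strip()
```

Inside the loop this could be used as `r = fetch_with_retry(requests.get, link["href"])`, skipping the link (or logging `link["href"]`) when the result is `None`, and as `names.append(text_or_none(soup.find("h1", class_="h1")))`. Printing `r.status_code` for a failing link would help tell whether the website or the parsing code is responsible.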