I have some code that scrapes some resources from the internet, like this:
def ScrapeFromUrl(url):
    with urllib.request.urlopen(url) as response:
        html = response.read()
    urlToFile('main', html.decode('utf-8'))
    webSoup = BeautifulSoup(html, 'html.parser')
    mainContent = webSoup.find("div", {"id": "main"})
    generalImdbData['nextPageUrl'] = mainContent.findChildren()[0].findChildren()[2].find('a').get('href')
    generalImdbData['totalResults'] = int(re.search(r'(\d+)(?!.*\d)', mainContent.findChildren()[0].findChildren()[2].span.contents[0]).group(1))
    generalImdbData['loadedResults'] = int(re.search(r'\-(\d+)', mainContent.findChildren()[0].findChildren()[2].span.contents[0]).group(1))
    actorsContainer = mainContent.findAll("div", {"class": "lister-list"})[0]
    for actor in actorsContainer.findAll("div", {"class": "lister-item"}):
        SearchResultsToActorObjects(actor)
    urlToFile('data', str(mainContent))
    GoToNextPageUrl(generalImdbData['loadedResults'], generalImdbData['nextPageUrl'])

def GoToNextPageUrl(loadedResultsCount, nextUrl):
    if loadedResultsCount >= generalImdbData['totalResults']:
        for a in actorObjectList:
            a.printActor()
            a.insertIntoDB()
        actorObjectList.clear()
    else:
        for a in actorObjectList:
            a.printActor()
            a.insertIntoDB()
        actorObjectList.clear()
        ScrapeFromUrl(generalImdbData['baseUrl'] + nextUrl)
The function is initially called like this:
ScrapeFromUrl(generalImdbData['originalSearchUrl'])
The problem I'm running into is that these functions end up being called about 50,000 times, so I get the recursion-limit error ("maximum recursion depth exceeded").
How can I prevent this?
Answer 0 (score: 0)
ScrapeFromUrl() calls GoToNextPageUrl(), which calls ScrapeFromUrl(), which calls GoToNextPageUrl(), and so on. This is unbounded mutual recursion: every page adds two frames to the call stack, so after a few hundred pages you exceed Python's default recursion limit of about 1,000 frames.
You need to reorganize the code so that the functions do not call each other endlessly; replace the mutual recursion with an ordinary loop.
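For example, the pagination can be driven by a plain while loop. Here is a minimal sketch; ScrapeOnePage and ScrapeAllPages are hypothetical names, and ScrapeOnePage is assumed to be the question's ScrapeFromUrl with its final GoToNextPageUrl(...) call deleted. Note that both branches of GoToNextPageUrl already do the same printing and flushing, so that work simply moves into the loop body:

def ScrapeAllPages(startUrl):
    url = startUrl
    while True:
        # Hypothetical helper: the question's ScrapeFromUrl minus its last
        # line. It parses one page and updates generalImdbData and
        # actorObjectList, but does not recurse.
        ScrapeOnePage(url)
        # Flush the actors collected from this page (the work both branches
        # of GoToNextPageUrl performed).
        for a in actorObjectList:
            a.printActor()
            a.insertIntoDB()
        actorObjectList.clear()
        # Stop once every result has been loaded; otherwise follow the
        # next-page link. Each loop iteration replaces one recursive call.
        if generalImdbData['loadedResults'] >= generalImdbData['totalResults']:
            break
        url = generalImdbData['baseUrl'] + generalImdbData['nextPageUrl']

ScrapeAllPages(generalImdbData['originalSearchUrl'])

Because the loop never grows the call stack, scraping 50,000 pages is no different from scraping one, and the recursion limit never comes into play.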