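I am currently trying to scrape the ATP (tennis association) website, and I have run into a problem I cannot solve.
When I try to scrape lines located after line 2700, I get an error.
Is there a way around this?
Here is my code (this code works for the earlier lines):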
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
from urllib2 import urlopen
import sys

BASE_URL = "http://www.atpworldtour.com/Share/Event-Draws.aspx?e=540&y=2012"

def make_soup(url):
    html = urlopen(url).read()
    return BeautifulSoup(html, "lxml")

def get_player_name_third_round_winner(section_url):
    soup = make_soup(section_url)
    colonne4 = soup.find("td", "col_4")
    playerWrap = colonne4.findAll("div", "playerWrap")
    for name in playerWrap:
        print name.find("a").string

def get_player_score_third_round_winner(section_url):
    soup = make_soup(section_url)
    colonne4 = soup.find("td", "col_4")
    scores = colonne4.findAll("div", "scores")
    for score in scores:
        print score.find("a").string

get_player_name_third_round_winner(BASE_URL)
get_player_score_third_round_winner(BASE_URL)
This is the error that is displayed:
Traceback (most recent call last):
  File "/Users/Me/Desktop/ATP/atp_col4", line 27, in <module>
    get_player_name_third_round_winner(BASE_URL)
  File "/Users/Me/Desktop/ATP/atp_col4", line 16, in get_player_name_third_round_winner
    playerWrap = colonne4.findAll("div", "playerWrap")
AttributeError: 'NoneType' object has no attribute 'findAll'
[Finished in 1.6s with exit code 1]
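Answer 0 (score: 0)
Well, I got the same error as you, but I managed to get it to print the results. I don't know whether this is the best solution, but at least it is one.
I changed the code to this: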
def make_soup(url):
    html = urlopen(url).read()
    return BeautifulSoup(html, "html.parser")
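Whichever parser is used, find() returns None when it finds no matching element, and that is what triggers the AttributeError in the traceback. As a rough sketch (not part of my original fix), the lookup could be guarded so the failure is easier to read. The function name get_player_names and the column_class parameter are made up for illustration, and soup is assumed to be the object returned by make_soup():

```python
def get_player_names(soup, column_class="col_4"):
    """Return the player names found in one column of the draw.

    `soup` is assumed to be the object returned by make_soup().
    The function name and `column_class` are illustrative only.
    """
    column = soup.find("td", column_class)
    if column is None:
        # find() returns None when nothing matches, so fail with a
        # clear message instead of an AttributeError further down.
        raise ValueError("no <td class=%r> in the parsed page" % column_class)

    names = []
    for wrap in column.find_all("div", "playerWrap"):
        link = wrap.find("a")
        if link is not None:  # skip wrappers that contain no link
            names.append(link.string)
    return names
```

Returning a list instead of printing inside the loop also makes it easy to check how many names were actually parsed.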
Then I also added this part:
import sys
sys.setrecursionlimit(30000)
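Raising the limit for the whole script is a blunt tool, and a limit that is too high can crash the interpreter instead of raising an exception. One possible variation, offered only as a sketch, is to raise it just around the parsing step and restore it afterwards. The helper name recursion_limit is hypothetical:

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit (hypothetical helper)."""
    previous = sys.getrecursionlimit()
    # Never lower the limit below its current value.
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        # Restore the original limit even if parsing raised an error.
        sys.setrecursionlimit(previous)
```

With this in place, the parsing call could be wrapped as `with recursion_limit(30000): soup = make_soup(BASE_URL)`.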