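This code gets all the match outcomes: team versus team, and the match scores. For example, for a team such as http://www.gosugamers.net/counterstrike/teams/7395-mousesports-cs/matches. However, it only gets the results on the first page, and I am trying to get the results from every page. The problem is that some teams have no next-page button, so the program crashed when I tried to implement this. How can I write the code so that it fetches the next page and keeps collecting results, and simply moves on when a team's match link has no next page?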
def all_match_outcomes():
    for match_outcomes in match_history_url():
        rest_server(True)
        page = requests.get(match_outcomes).content
        soup = BeautifulSoup(page, 'html.parser')
        team_name_element = soup.select_one('div.teamNameHolder')
        team_name = team_name_element.find('h1').text.replace('- Team Overview', '')
        for match_outcome in soup.select('table.simple.gamelist.profilelist tr'):
            opp1 = match_outcome.find('span', {'class': 'opp1'}).text
            opp2 = match_outcome.find('span', {'class': 'opp2'}).text
            opp1_score = match_outcome.find('span', {'class': 'hscore'}).text
            opp2_score = match_outcome.find('span', {'class': 'ascore'}).text
            if match_outcome(True):  # If teams have past matches
                print(team_name, '%s %s:%s %s' % (opp1, opp1_score, opp2_score, opp2))
Answer 0 (score: 0)
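After the for loop that pulls the game scores out of the table, you can grab the pagination links.
With this code, you get the next page by looking up the currently selected page and taking the link that follows it. If there is no page beyond the currently selected one, or the page has no paginator at all, it prints "no page found".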
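paginate = soup.find('div', {'class': 'paginator'})
# The paginator may be missing when a team has only one page of results
page = paginate.find('a', {'class': 'selected'}) if paginate else None
next_page = page.find_next_sibling() if page else None
if next_page:
    print(next_page.get('href'))
else:
    print("no page found")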
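The lookup above can be wrapped in a small helper that returns None instead of raising when any piece is missing. This is a minimal sketch, not the site's confirmed markup: the function name next_page_href and the inline HTML are hypothetical, and only the 'paginator' and 'selected' class names come from the snippet above.

```python
from bs4 import BeautifulSoup


def next_page_href(soup):
    """Return the href of the link after the selected page, or None."""
    paginator = soup.find('div', {'class': 'paginator'})
    if paginator is None:  # no pagination block on this page
        return None
    selected = paginator.find('a', {'class': 'selected'})
    if selected is None:  # paginator present, but no page marked as current
        return None
    next_link = selected.find_next_sibling('a')
    return next_link.get('href') if next_link else None


# Hypothetical markup that only mimics the structure assumed above.
multi_page = BeautifulSoup(
    '<div class="paginator">'
    '<a class="selected" href="?page=1">1</a>'
    '<a href="?page=2">2</a>'
    '</div>',
    'html.parser',
)
single_page = BeautifulSoup('<p>no paginator here</p>', 'html.parser')

print(next_page_href(multi_page))   # ?page=2
print(next_page_href(single_page))  # None
```

Because every failure mode collapses to None, the caller needs only one check to decide whether to keep crawling.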
Edit
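In response to the comments: this is how I would use this code. The next-page URL is recorded at the end of each iteration, so you can continue your loop with it.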
def all_match_outcomes():
    for match_outcomes in match_history_url():
        rest_server(True)
        page = requests.get(match_outcomes).content
        soup = BeautifulSoup(page, 'html.parser')
        team_name_element = soup.select_one('div.teamNameHolder')
        team_name = team_name_element.find('h1').text.replace('- Team Overview', '')
        for match_outcome in soup.select('table.simple.gamelist.profilelist tr'):
            opp1 = match_outcome.find('span', {'class': 'opp1'}).text
            opp2 = match_outcome.find('span', {'class': 'opp2'}).text
            opp1_score = match_outcome.find('span', {'class': 'hscore'}).text
            opp2_score = match_outcome.find('span', {'class': 'ascore'}).text
            if match_outcome(True):  # If teams have past matches
                print(team_name, '%s %s:%s %s' % (opp1, opp1_score, opp2_score, opp2))
        # get the next page if there is one here
        paginate = soup.find('div', {'class': 'paginator'})
        # the paginator may be missing when a team has only one page
        page = paginate.find('a', {'class': 'selected'}) if paginate else None
        if page:
            next_page = page.find_next_sibling()
            if next_page:
                print(next_page.get('href'))
                # just append this to a list or add it to whatever you use to
                # track the next url to crawl
                next_url = next_page.get('href')
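One way to "continue your loop" with the tracked URL is a generator that keeps following next-page links until none is found. This is only a sketch under stated assumptions: iter_pages, fetch, and find_next_href are hypothetical names, and the in-memory fake_site stands in for real HTTP requests so the example runs without a network.

```python
from urllib.parse import urljoin


def iter_pages(start_url, fetch, find_next_href):
    """Yield (url, page) for start_url and every page linked after it.

    fetch(url) -> page and find_next_href(page) -> href or None are
    hypothetical callables supplied by the caller.
    """
    url, seen = start_url, set()
    while url and url not in seen:  # stop on a missing link or a cycle
        seen.add(url)
        page = fetch(url)
        yield url, page
        href = find_next_href(page)
        # hrefs are often relative, so resolve them against the current URL
        url = urljoin(url, href) if href else None


# Tiny in-memory "site" standing in for real HTTP requests.
fake_site = {
    'http://example.com/matches': {'next': '?page=2'},
    'http://example.com/matches?page=2': {'next': None},
}

visited = [
    url
    for url, _ in iter_pages(
        'http://example.com/matches',
        fake_site.__getitem__,
        lambda page: page['next'],
    )
]
print(visited)
```

In the scraper above, fetch could download and parse a page with requests and BeautifulSoup, and find_next_href could be the paginator lookup. A team with a single page then yields exactly one page and the outer loop moves on.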