Question

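This code gets all match results: the team, its opponents, and the match scores, for a team page such as http://www.gosugamers.net/counterstrike/teams/7395-mousesports-cs/matches. However, it only gets the results from the first page, and I am trying to get the results from every page. The problem is that some teams have no next-page button, so the program crashed when I tried to implement this. How can I write the code so that it fetches the next page and keeps collecting results, and simply moves on when a team's matches link has no next page?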

import requests
from bs4 import BeautifulSoup

def all_match_outcomes():
    for match_outcomes in match_history_url():
        rest_server(True)
        page = requests.get(match_outcomes).content
        soup = BeautifulSoup(page, 'html.parser')

        team_name_element = soup.select_one('div.teamNameHolder')
        team_name = team_name_element.find('h1').text.replace('- Team Overview', '')

        for match_outcome in soup.select('table.simple.gamelist.profilelist tr'):
            opp1 = match_outcome.find('span', {'class': 'opp1'}).text
            opp2 = match_outcome.find('span', {'class': 'opp2'}).text

            opp1_score = match_outcome.find('span', {'class': 'hscore'}).text
            opp2_score = match_outcome.find('span', {'class': 'ascore'}).text

            if match_outcome(True):  # If teams have past matches
                print(team_name, '%s %s:%s %s' % (opp1, opp1_score, opp2_score, opp2))

Answer 1

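After the for loop that pulls the game scores out of the table, you can grab the pagination links.

With this code, you can get the next page by finding the currently selected page and taking the link that follows it. If there is no page beyond the currently selected one, it prints "no page found".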

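paginate = soup.find('div', {'class': 'paginator'})

# Teams with only one page of matches have no paginator at all
page = paginate.find('a', {'class': 'selected'}) if paginate else None

next_page = page.find_next_sibling() if page else None
if next_page:
    print(next_page.get('href'))
else:
    print("no page found")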

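Edit

In response to the comment: this is how I would use this code. The next URL is then added to whatever you use to track pages, and you can continue looping.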

def all_match_outcomes():
    for match_outcomes in match_history_url():
        rest_server(True)
        page = requests.get(match_outcomes).content
        soup = BeautifulSoup(page, 'html.parser')

        team_name_element = soup.select_one('div.teamNameHolder')
        team_name = team_name_element.find('h1').text.replace('- Team Overview', '')

        for match_outcome in soup.select('table.simple.gamelist.profilelist tr'):
            opp1 = match_outcome.find('span', {'class': 'opp1'}).text
            opp2 = match_outcome.find('span', {'class': 'opp2'}).text

            opp1_score = match_outcome.find('span', {'class': 'hscore'}).text
            opp2_score = match_outcome.find('span', {'class': 'ascore'}).text

            if match_outcome(True):  # If teams have past matches
                print(team_name, '%s %s:%s %s' % (opp1, opp1_score, opp2_score, opp2))

        # get the next page if there is one here
        paginate = soup.find('div', {'class': 'paginator'})
        page = paginate.find('a', {'class': 'selected'}) if paginate else None
        if page:
            next_page = page.find_next_sibling()
            if next_page:
                print(next_page.get('href'))
                # just append this to a list or add it to whatever you use to
                # track the next url to crawl
                next_url = next_page.get('href')
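
The "track the next url to crawl" step could be sketched roughly as below. This is only an illustration, not the original poster's code: the function name crawl_all_pages, the fetch_page and find_next_href callbacks, and the example.com URLs are all made up for the example, and a dictionary stands in for real HTTP requests so the loop can be run on its own. It also assumes the href may be relative, which is why urljoin is used.

```python
from urllib.parse import urljoin

def crawl_all_pages(start_url, fetch_page, find_next_href):
    """Visit start_url and every following page, yielding (url, page)."""
    url = start_url
    seen = set()
    while url and url not in seen:   # stop on a missing link or a repeated URL
        seen.add(url)
        page = fetch_page(url)
        yield url, page
        href = find_next_href(page)  # None when there is no next page
        # Resolve a possibly relative href against the current URL
        url = urljoin(url, href) if href else None

# Stand-in data instead of real HTTP requests (hypothetical URLs)
fake_site = {
    'http://example.com/matches': {'next': '?page=2'},
    'http://example.com/matches?page=2': {'next': '?page=3'},
    'http://example.com/matches?page=3': {'next': None},
}

visited = [url for url, _ in crawl_all_pages(
    'http://example.com/matches',
    fetch_page=fake_site.__getitem__,
    find_next_href=lambda page: page['next'],
)]
print(visited)
```

In the scraper above, fetch_page would presumably wrap requests.get plus BeautifulSoup, and find_next_href would be the paginator lookup that returns None for teams with a single page, so those teams fall out of the loop after one iteration instead of crashing.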

How do I get next-page results only for pages that have one?
