I'm trying to use Beautiful Soup to scrape this site, https://www.playtogga.com/leagues/5969cefb9dbb4f0001b3b539/players, and access the data in the table. My current code is:
import requests
from bs4 import BeautifulSoup
import pandas as pd
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'
}
page = "https://www.playtogga.com/leagues/5969cefb9dbb4f0001b3b539/players"
pageTree = requests.get(page, headers=headers)
pageSoup = BeautifulSoup(pageTree.content, 'html.parser')
Players = pageSoup.find_all("a", {"class": "player-list"})
#Let's look at the first name in the Players list.
Players[0].text
But this produces the following error:
IndexError Traceback (most recent call last)
<ipython-input-42-b6ae920c924b> in <module>()
1 #Let's look at the first name in the Players list.
----> 2 Players[0].text
IndexError: list index out of range
I've used this same code on other sites and it works fine, and when I check type(Players) it gives me bs4.element.ResultSet, so it seems to be doing something.
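For what it's worth, here is a minimal sketch of a check that does not touch the real site. It uses a made-up HTML string (sample_html is purely hypothetical, not the actual page source) and suggests that find_all returns a ResultSet even when nothing matches, so the type on its own doesn't tell me whether any tags were found:

```python
from bs4 import BeautifulSoup
from bs4.element import ResultSet

# Hypothetical static HTML standing in for a response that contains
# no <a class="player-list"> tags; this is NOT the real page source.
sample_html = "<html><body><div id='app'></div></body></html>"

soup = BeautifulSoup(sample_html, "html.parser")
players = soup.find_all("a", {"class": "player-list"})

# find_all returns a ResultSet whether or not anything matched.
is_result_set = isinstance(players, ResultSet)
match_count = len(players)
print(type(players), match_count)

# Guard against the empty case instead of indexing blindly.
first_name = players[0].text if players else None
print(first_name)
```

If len(Players) is also 0 for the real page, that would be consistent with the HTML returned by requests simply not containing any matching tags, though I haven't confirmed why that would be the case.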
Is there something I'm missing? I'm quite new to this, so I suspect I may be overlooking something very obvious. Thanks!