Code:
from bs4 import BeautifulSoup
from selenium import webdriver

team1 = []
team2 = []
url = "https://www.basketball-reference.com/leagues/NBA_2019_games.html"
driver = webdriver.Chrome(executable_path=r"chromedriver.exe")
driver.implicitly_wait(30)
driver.get(url)
soup1 = BeautifulSoup(driver.page_source, 'lxml')
# Click each 'Box Score' link on the schedule page by its index
for i in range(len(soup1.find_all('a', href=True, text='Box Score'))):
    driver.find_elements_by_link_text('Box Score')[i].click()
    driver.implicitly_wait(10)
    soup2 = BeautifulSoup(driver.page_source, 'lxml')
    scorebox = soup2.find_all('div',{'class':'scorebox'})[0]
    team1.append(scorebox.find_all('a', itemprop='name')[0].text.strip())
    team2.append(scorebox.find_all('a', itemprop='name')[1].text.strip())
    driver.implicitly_wait(10)
    # Go back to the schedule page before the next iteration
    driver.execute_script("window.history.go(-2)")
Error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-2-b1355c7b0c2e> in <module>()
12 for i in range(len(soup1.find_all('a', href=True, text='Box Score'))):
13
---> 14 driver.find_elements_by_link_text('Box Score')[i].click()
15 driver.implicitly_wait(10)
16
IndexError: list index out of range
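Basically, I want to scrape the team names from every game, but the loop stops after a random number of iterations. (Sometimes I collect 10 samples, sometimes 50, before it stops with the error above.)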
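One possible explanation, not confirmed here: `implicitly_wait` only sets a timeout for element lookups and does not pause the script, so after navigating back the list returned by `find_elements_by_link_text` may be shorter than the count taken from `soup1`. A direction that might avoid indexing into a freshly re-queried list is to collect every 'Box Score' URL once from the page source and then visit each URL directly. The sketch below shows only the link-collection step. It uses the standard-library `html.parser` in place of BeautifulSoup so that it runs on its own; `BoxScoreLinkParser`, `box_score_urls`, and the sample markup are hypothetical names and data made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BoxScoreLinkParser(HTMLParser):
    """Collect the href of every <a> whose visible text is 'Box Score'."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if "".join(self._text).strip() == "Box Score":
                # Resolve relative hrefs against the schedule page URL
                self.links.append(urljoin(self.base_url, self._href))
            self._href = None


def box_score_urls(page_source, base_url):
    """Return absolute URLs of all 'Box Score' links found in page_source."""
    parser = BoxScoreLinkParser(base_url)
    parser.feed(page_source)
    return parser.links


# Made-up markup standing in for driver.page_source (illustration only)
sample = """
<td><a href="/boxscores/game1.html">Box Score</a></td>
<td><a href="/teams/XYZ/2019.html">Some Team</a></td>
<td><a href="/boxscores/game2.html">Box Score</a></td>
"""
url = "https://www.basketball-reference.com/leagues/NBA_2019_games.html"
urls = box_score_urls(sample, url)
```

With a live session, the loop could then become `for link in box_score_urls(driver.page_source, url): driver.get(link)`, followed by the same `scorebox` parsing as before, so no click or `window.history.go(-2)` call is needed. Whether this removes the `IndexError` in practice is untested here.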