Selenium: how to locate multiple elements (second, third, fourth, etc.) with the same class name by class_name in Python

Time: 2019-01-24 12:03:27

Tags: python selenium web-scraping

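I just want to know how to get the next table with the same class name 'linescores-table', because I only get the first table and cannot reach the second one.

I tried this code, adding an "s" to "element", but it raises an error: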

[table = driver.find_elements_by_class_name("linescores-table")]

This is the table I want to scrape with Selenium. Here is the complete code; you can run it yourself to see my problem more clearly:

from selenium import webdriver
from pandas import *
import pandas as pd
import numpy as np

path_to_chromedriver = 'chromedriver.exe' 
driver = webdriver.Chrome(executable_path=path_to_chromedriver)

url = 'https://stats.nba.com/scores/01/23/2019'
driver.get(url)

table = driver.find_element_by_class_name("linescores-table")

team1_quarter_score= []
team2_quarter_score= []

for line_id,lines in enumerate(table.text.split('\n')):
    if line_id == 0:
        column_names = lines.split(' ')[1:]
    else:
        if line_id % 2 == 1:
            team1_quarter_score.append(lines.split(' '))
        if line_id % 2 == 0:
            team2_quarter_score.append(lines.split(' '))

# Use the `pd` alias imported above; the bare name `pandas` is not reliably bound here
df1 = pd.DataFrame({'teams': [i[0] for i in team1_quarter_score],
                    'standing': [i[1] for i in team1_quarter_score],
                    'q1': [i[2] for i in team1_quarter_score],
                    'q2': [i[3] for i in team1_quarter_score],
                    'q3': [i[4] for i in team1_quarter_score],
                    'q4': [i[5] for i in team1_quarter_score],
                    'finalScore': [i[6] for i in team1_quarter_score]})
df2 = pd.DataFrame({'teams': [i[0] for i in team2_quarter_score],
                    'standing': [i[1] for i in team2_quarter_score],
                    'q1': [i[2] for i in team2_quarter_score],
                    'q2': [i[3] for i in team2_quarter_score],
                    'q3': [i[4] for i in team2_quarter_score],
                    'q4': [i[5] for i in team2_quarter_score],
                    'finalScore': [i[6] for i in team2_quarter_score]})
df=df1.append(df2)
print(df)

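I expected to collect all the data from the second table with the same class name as well, but I only get the first table.

This is the actual output of the code: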

  teams standing  q1  q2  q3  q4 finalScore
0   TOR    36-14  16  31  28  31        106
1   IND    32-15  24  35  25  26        110

My expected output would be 2 tables:

  teams standing  q1  q2  q3  q4 finalScore
0   TOR    36-14  16  31  28  31        106
1   IND    32-15  24  35  25  26        110

  teams standing  q1  q2  q3  q4 finalScore
0   CLE     9-40  30  20  30  23        103
1   BOS    30-18  27  38  27  31        123

Here is my complete updated, working code. Thanks for your help ^^

from selenium import webdriver

from pandas import *
import pandas as pd
import numpy as np

path_to_chromedriver = 'chromedriver.exe' 
driver = webdriver.Chrome(executable_path=path_to_chromedriver)

url = 'https://stats.nba.com/scores/01/23/2019'
driver.get(url)

tables = driver.find_elements_by_class_name("linescores-table")

for table in tables:
    # Reset the lists for every table so rows from different games do not mix
    team1_quarter_score= []
    team2_quarter_score= []

    # Everything below must be indented inside the loop so it runs once per table
    for line_id,lines in enumerate(table.text.split('\n')):
        if line_id == 0:
            column_names = lines.split(' ')[1:]
        else:
            if line_id % 2 == 1:
                team1_quarter_score.append(lines.split(' '))
            if line_id % 2 == 0:
                team2_quarter_score.append(lines.split(' '))

    df1 = pd.DataFrame({'teams': [i[0] for i in team1_quarter_score],
                        'standing': [i[1] for i in team1_quarter_score],
                        'q1': [i[2] for i in team1_quarter_score],
                        'q2': [i[3] for i in team1_quarter_score],
                        'q3': [i[4] for i in team1_quarter_score],
                        'q4': [i[5] for i in team1_quarter_score],
                        'finalScore': [i[6] for i in team1_quarter_score]})
    df2 = pd.DataFrame({'teams': [i[0] for i in team2_quarter_score],
                        'standing': [i[1] for i in team2_quarter_score],
                        'q1': [i[2] for i in team2_quarter_score],
                        'q2': [i[3] for i in team2_quarter_score],
                        'q3': [i[4] for i in team2_quarter_score],
                        'q4': [i[5] for i in team2_quarter_score],
                        'finalScore': [i[6] for i in team2_quarter_score]})
    # Build and print one DataFrame per table
    df=df1.append(df2)
    print(df)

1 Answer:

Answer 0: (score: 3)

Your code only fetches the first table. You were close with your other attempt. Try:

tables = driver.find_elements_by_class_name("linescores-table")

That should give you all of the tables to work with. Then just wrap the rest of your code in a for loop so the same work is done on every table:

for table in tables:
    team1_quarter_score= []
    ...
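
One likely reason the first attempt with the "s" failed: find_elements_by_class_name returns a list of elements, and a list has no .text attribute, so each element has to be handled inside the loop. Below is a minimal sketch of that per-table loop, not the asker's exact code. It runs without a browser by using stand-in objects that only have a .text attribute. The names parse_linescore, parse_all, COLUMNS and fake_tables are hypothetical, the header line in the sample text is made up, and the score lines are copied from the question.

```python
from types import SimpleNamespace

# Column names taken from the DataFrame keys used in the question
COLUMNS = ["teams", "standing", "q1", "q2", "q3", "q4", "finalScore"]


def parse_linescore(text):
    """Parse the .text of ONE table into a list of row dicts."""
    lines = text.strip().split("\n")
    rows = []
    for line in lines[1:]:  # the first line is the header, skip it
        fields = line.split(" ")
        # zip stops at the shorter sequence, so extra fields
        # (for example overtime columns) would be dropped silently
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def parse_all(tables):
    """Parse every table; `tables` is any list of objects with a .text attribute."""
    return [parse_linescore(table.text) for table in tables]


# Stand-ins for the elements returned by find_elements_by_class_name.
# The header line is an assumption, the score lines come from the question.
fake_tables = [
    SimpleNamespace(text="HEADER Q1 Q2 Q3 Q4 T\n"
                         "TOR 36-14 16 31 28 31 106\n"
                         "IND 32-15 24 35 25 26 110"),
    SimpleNamespace(text="HEADER Q1 Q2 Q3 Q4 T\n"
                         "CLE 9-40 30 20 30 23 103\n"
                         "BOS 30-18 27 38 27 31 123"),
]

games = parse_all(fake_tables)
for game in games:
    for row in game:
        print(row)
    print()
```

With a real driver you would pass the list from find_elements_by_class_name straight to parse_all, and each inner list could then be handed to pd.DataFrame. Note that newer Selenium 4 releases dropped the find_elements_by_* helpers; there the equivalent call is driver.find_elements(By.CLASS_NAME, "linescores-table") with By imported from selenium.webdriver.common.by.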