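I want to scrape the following page: https://rotogrinders.com/resultsdb/date/2019-01-13/sport/4/slate/5c3c66edb1699a43c0d7bba7/contest/5c3c66f2b1699a43c0d7bd0d
There is a main table with a User column. When you click a user, a second table appears next to it showing the team that user entered in the contest. I want to extract the teams of all users, so I need to loop over every user: click the name, then extract the information from the second table. Here is my code, which extracts the first user's team: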
from selenium import webdriver
import csv
from selenium.webdriver.support.ui import Select
from datetime import date, timedelta
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chromedriver =("C:/Users/Michel/Desktop/python/package/chromedriver_win32/chromedriver.exe")
driver = webdriver.Chrome(chromedriver)
DFSteam = []
driver.get("https://rotogrinders.com/resultsdb/date/2019-01-13/sport/4/slate/5c3c66edb1699a43c0d7bba7/contest/5c3c66f2b1699a43c0d7bd0d")
Team1 = driver.find_element(By.CSS_SELECTOR, "table.ant-table-fixed")
# Read the text before closing the browser; close() needs parentheses to actually run
print(Team1.text)
driver.close()
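However, I can't work out how to loop over the different users. I noticed that when I click a user, the class of that row's tr switches from inactive to active in the page source, but I don't know how to make use of that. I would also like to store the extracted teams in a dataframe. I'm not sure whether it is better to do that while scraping or afterwards.
The dataframe should look like this: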
Rank (team) / C / C / W / W / W / D / D / G / UTIL / TOTAL ($) / Total points
1 / Mark Scheifel / Mickael Backlund / Artemi Panarin / Nick Foligno / Michael Frolik / Mark Giordano / Zach Werenski / Connor Hellebuyck / Brandon Tanev / 50 000 / 54.60
Answer 0 (score: 1)
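You have the right idea. Find the username element and click it to get the lineup table, then reshape that table and combine everything into one results dataframe.
The username text is wrapped in an <a> tag, so you only need to find the <a> tag whose text matches the username.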
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
import pandas as pd
url = 'https://rotogrinders.com/resultsdb/date/2019-01-13/sport/4/slate/5c3c66edb1699a43c0d7bba7/contest/5c3c66f2b1699a43c0d7bd0d'
# Open Browser and go to site
driver = webdriver.Chrome("C:/chromedriver_win32/chromedriver.exe")
driver.get(url)
# Wait until the tables are loaded and contain text; time out after 60 seconds
WebDriverWait(driver, 60).until(ec.presence_of_element_located((By.XPATH, './/tbody//tr//td//span//a[text() != ""]')))
# Get tables to get the user names
tables = pd.read_html(driver.page_source)
# Copy the slice so the assignment below does not trigger SettingWithCopyWarning
users_df = tables[0][['Rank','User']].copy()
users_df['User'] = users_df['User'].str.replace(' Member', '')
# Collect one single-row dataframe per user while iterating through users
frames = []
for i, row in users_df.iterrows():
    rank = row['Rank']
    user = row['User']
    # Find the user name and click on the name
    user_link = driver.find_elements(By.XPATH, "//a[text()='%s']" % user)[0]
    user_link.click()
    # Get the lineup table after clicking on the user name
    tables = pd.read_html(driver.page_source)
    lineup = tables[1]
    #print (user)
    #print (lineup)
    # Restructure to put into results dataframe:
    # copy the totals from row 9 (salary and points) into the 'Name' column
    lineup.loc[9, 'Name'] = lineup.iloc[9]['Salary']
    lineup.loc[10, 'Name'] = lineup.iloc[9]['Pts']
    temp_df = pd.DataFrame(lineup['Name'].values.reshape(-1, 11),
                           columns=lineup['Pos'].iloc[:9].tolist() + ['Total_$', 'Total_Pts'])
    temp_df.insert(loc=0, column='User', value=user)
    temp_df.insert(loc=0, column='Rank', value=rank)
    frames.append(temp_df)
# DataFrame.append was removed in pandas 2.0, so combine the rows with pd.concat
results = pd.concat(frames, ignore_index=True)
driver.close()
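The reshaping step in the loop is the least obvious part, so here is a minimal sketch of the same idea in plain Python, without a browser. It assumes the lineup arrives as one tuple per player followed by a totals tuple, which mirrors how the code above treats the scraped table. The function name `lineup_to_row`, the tuple layout and the sample players are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def lineup_to_row(rank, user, lineup):
    """Flatten one lineup table into a single flat record.

    `lineup` is a list of (pos, name, salary, pts) tuples: one tuple per
    player, then a final totals tuple whose salary and pts fields hold the
    lineup totals (an assumed layout, not confirmed against the site).
    """
    *players, totals = lineup
    row = {"Rank": rank, "User": user}
    seen = Counter()
    for pos, name, _salary, _pts in players:
        seen[pos] += 1
        # Positions repeat (C, C, W, W, W, ...), so number them to keep keys unique.
        row[f"{pos}{seen[pos]}"] = name
    row["Total_$"] = totals[2]
    row["Total_Pts"] = totals[3]
    return row


# Hypothetical, shortened lineup for illustration only (not scraped data).
example = [
    ("C", "Player A", "$7,000", 10.0),
    ("C", "Player B", "$6,000", 8.5),
    ("W", "Player C", "$5,500", 7.0),
    ("G", "Player D", "$8,000", 12.0),
    (None, None, "$26,500", 37.5),
]
row = lineup_to_row(1, "example_user", example)
print(row)
```

If you collect such dicts in a list, a single `pd.DataFrame(rows)` call after the loop builds the whole table at once. Numbering the positions also avoids the duplicate column names (C, C, W, ...) that the dataframe above ends up with.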
Output:
print (results)
Rank User ... Total_$ Total_Pts
0 1 Canadaman101 ... $50,000.00 54.6
1 2 MayhemLikeMe27 ... $50,000.00 53.9
2 2 gunslinger58 ... $50,000.00 53.9
3 4 oilkings ... $48,600.00 53.6
4 5 TTB19 ... $50,000.00 53.4
5 6 Adamjloder ... $49,800.00 53.1
6 7 DollarBillW ... $49,900.00 52.6
7 8 Biglarry696 ... $49,900.00 52.4
8 8 tical1994 ... $49,900.00 52.4
9 8 rollem02 ... $49,900.00 52.4
10 8 kchoban ... $50,000.00 52.4
11 8 TBirdSCIL ... $49,900.00 52.4
12 13 manny716 ... $49,900.00 52.1
13 14 JayKooks ... $50,000.00 51.9
14 15 Cambie19 ... $49,900.00 51.4
15 16 mjh6588 ... $50,000.00 51.1
16 16 shanefriesen ... $50,000.00 51.1
17 16 mnfish42 ... $50,000.00 51.1
18 19 Pugsly55 ... $49,900.00 50.9
19 19 volpez7 ... $50,000.00 50.9
20 19 Scherr47 ... $49,900.00 50.9
21 19 Testosterown ... $50,000.00 50.9
22 23 markm22 ... $49,700.00 50.6
23 23 foreveryoung12 ... $49,800.00 50.6
24 23 STP_Picks ... $49,900.00 50.6
25 26 jibbinghippo ... $49,800.00 50.4
26 26 loumister35 ... $49,900.00 50.4
27 26 creels3 ... $50,000.00 50.4
28 26 JayKooks ... $50,000.00 51.9
29 26 mmeiselman731 ... $49,900.00 50.4
30 26 volpez7 ... $50,000.00 50.9
31 26 tommienation1 ... $49,900.00 50.4
32 26 jibbinghippo ... $49,800.00 50.4
33 26 Testosterown ... $50,000.00 50.9
34 35 nut07 ... $50,000.00 49.9
35 35 volpez7 ... $50,000.00 50.9
36 35 durfdurf ... $50,000.00 49.9
37 35 chupacabra21 ... $50,000.00 49.9
38 39 Mbermes01 ... $50,000.00 49.6
39 40 suerte41 ... $50,000.00 49.4
40 40 spliksskins77 ... $50,000.00 49.4
41 42 Andrewskoff ... $49,600.00 49.1
42 42 Alky14 ... $49,800.00 49.1
43 42 bretned ... $50,000.00 49.1
44 42 bretned ... $50,000.00 49.1
45 42 gehrig38 ... $49,700.00 49.1
46 42 d-train_91 ... $49,500.00 49.1
47 42 DiamondDallas ... $50,000.00 49.1
48 49 jdmre ... $50,000.00 48.9
49 49 Devosty ... $50,000.00 48.9
[50 rows x 13 columns]
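One caveat about the output above: some usernames repeat with identical totals (volpez7 shows 50.9 at ranks 19, 26 and 35, and JayKooks shows 51.9 at both rank 14 and rank 26). That suggests the `[0]` in the XPath lookup always clicks the user's first entry, even when the user has several. A possible fix is to count how many times each name has already appeared and use that count as the index into the matching links. The sketch below shows only the counting part. The helper name `occurrence_indices` and the sample names are hypothetical, and it assumes the links appear in the page in the same order as the table rows.

```python
from collections import defaultdict


def occurrence_indices(users):
    """For each name, return how many times it already appeared earlier in the list."""
    seen = defaultdict(int)
    indices = []
    for user in users:
        indices.append(seen[user])
        seen[user] += 1
    return indices


# Hypothetical usernames for illustration; "alice" has three entries.
users = ["alice", "bob", "alice", "carol", "alice"]
positions = occurrence_indices(users)
print(positions)  # [0, 0, 1, 0, 2]

# In the scraping loop, the k-th entry of a user could then be clicked with
# something like this (untested against the live page):
#   links = driver.find_elements(By.XPATH, "//a[text()='%s']" % user)
#   links[k].click()
```

With that index in place of `[0]`, each row of the results should pick up its own lineup instead of repeating the first one.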