Scraping a dynamic web page quickly with Python

Time: 2020-05-17 01:36:24

Tags: python selenium web-scraping dynamic timing

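I have been testing scraping data from a motor racing live timing page. The page updates several times per second. I have been using Selenium to grab the values I want by element id, but there are 20 drivers with 11 stats each, so roughly 220 lookups per pass. That means it takes about 3 seconds to collect the data before the loop can start again. Is there a faster way?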

from selenium.webdriver.common.by import By

# `driver` is assumed to be a Selenium WebDriver that already has the live timing page open.
drivernums = ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '16', '17',
              '18', '19', '20']
# One dict per driver; slot 0 is unused so that driverList[int(d)] lines up with ids '01'..'20'.
driverList = []
for _ in range(len(drivernums) + 1):
    driverList.append(
        {"Pos": 0, "Name": '', "Lap": 0, "Gap": 0, "Interval": 0, "Pits": 0, 'Tyres': 0, 'S1': 0, 'S2': 0, 'S3': 0,
         'Laptime': 0})

while True:
    # Session-wide elements (kept as WebElements, not text, as in the original code).
    sessiontimeremaining = driver.find_element(By.ID, 'stats_si_time')
    safetycar = driver.find_element(By.ID, 'stats_si_safetycar')
    racelaps = driver.find_element(By.ID, 'stats_si_laps')
    for d in drivernums:
        stats = driverList[int(d)]
        # Each lookup below is a separate round trip to the browser: 11 per driver.
        stats['Pos'] = driver.find_element(By.ID, f'i_{d}_pos').text
        stats['Name'] = driver.find_element(By.ID, f'i_{d}_nick').text[0:3].upper()  # first three letters
        stats['Lap'] = driver.find_element(By.ID, f'i_{d}_lap').text
        stats['Gap'] = driver.find_element(By.ID, f'i_{d}_gap').text
        stats['Interval'] = driver.find_element(By.ID, f'i_{d}_int').text[:-1]  # drop the last character
        stats['Pits'] = driver.find_element(By.ID, f'i_{d}_pits').text
        stats['Tyres'] = driver.find_element(By.ID, f'i_{d}_tyres').text
        stats['S1'] = driver.find_element(By.ID, f'i_{d}_s1').text
        stats['S2'] = driver.find_element(By.ID, f'i_{d}_s2').text
        stats['S3'] = driver.find_element(By.ID, f'i_{d}_s3').text
        stats['Laptime'] = driver.find_element(By.ID, f'i_{d}_t').text
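
For what it's worth, most of the 3 seconds is probably spent on the roughly 220 separate round trips between Python and the browser, not on the page itself. One possible way to cut that down is to let the page collect everything in JavaScript and hand it back in a single `execute_script` call. The sketch below is untested against the real page: `FIELDS`, `SNAPSHOT_JS` and `snapshot` are names made up for illustration, the ids are assumed to follow the `i_<num>_<field>` pattern from the loop above, and it reads `textContent`, which can differ slightly from Selenium's `.text` when an element contains hidden text.

```python
# Maps the dictionary keys from the question to the id suffixes used on the page
# (assumed from the element ids in the original loop).
FIELDS = {
    "Pos": "pos", "Name": "nick", "Lap": "lap", "Gap": "gap",
    "Interval": "int", "Pits": "pits", "Tyres": "tyres",
    "S1": "s1", "S2": "s2", "S3": "s3", "Laptime": "t",
}

# JavaScript executed inside the page. It looks up every id and returns a
# nested object {driver number: {key: text}}; missing elements become "".
SNAPSHOT_JS = """
const nums = arguments[0], fields = arguments[1], out = {};
for (const n of nums) {
  out[n] = {};
  for (const [key, suffix] of Object.entries(fields)) {
    const el = document.getElementById(`i_${n}_${suffix}`);
    out[n][key] = el ? el.textContent.trim() : "";
  }
}
return out;
"""


def snapshot(driver, drivernums):
    """Return {driver number: {stat: text}} using one execute_script call."""
    raw = driver.execute_script(SNAPSHOT_JS, drivernums, FIELDS)
    for stats in raw.values():
        # Same post-processing as the original loop.
        stats["Name"] = stats["Name"][0:3].upper()
        stats["Interval"] = stats["Interval"][:-1]
    return raw
```

Inside the `while True:` loop this would be called as `data = snapshot(driver, drivernums)`, after which `data['01']['Pos']` holds what `driverList[1]['Pos']` held before. Note that the result is keyed by the driver number string instead of a list index.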

0 answers:

No answers yet