Stale element error after a specific element in a list

Asked: 2021-05-10 11:47:39

Tags: python selenium web-scraping

I am trying to get the tyre details from this page: https://eurawheels.com/fr/catalogue/BBS

links = driver.find_elements_by_xpath('//div[@class="col-xs-1 col-md-3"]//a')
parent_window = driver.current_window_handle
x = 0
for j in range(len(links)):
    driver.execute_script('window.open(arguments[0]);', links[j])
    #scraping here
    if x == 0:
       driver.close()
       driver.switch_to.window(parent_window)
       x += 1
    else:
        driver.back()
    driver.refresh() #refresh page
    tyres = WebDriverWait(driver, 25).until(EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="card-body text-center"]//a'))) #redefine links
    time.sleep(4)

It works for 10 links, but after that the links become stale. I can't figure out what needs to change. Any help is welcome.

2 Answers:

Answer 0: (score: 0)

You need to scroll the element into view before executing driver.execute_script('window.open(arguments[0]);', links[j]), because not all elements are loaded on the page initially.
So your code should look like this:

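import time
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# driver is the WebDriver instance already created in the question
actions = ActionChains(driver)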
links = driver.find_elements_by_xpath('//div[@class="col-xs-1 col-md-3"]//a')
parent_window = driver.current_window_handle
x = 0
for j in range(len(links)):
    actions.move_to_element(links[j]).perform()  # scroll the link (not the index) into view
    driver.execute_script('window.open(arguments[0]);', links[j])
    #scraping here
    if x == 0:
       driver.close()
       driver.switch_to.window(parent_window)
       x += 1
    else:
        driver.back()
    driver.refresh() #refresh page
    tyres = WebDriverWait(driver, 25).until(EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="card-body text-center"]//a'))) #redefine links
    time.sleep(4)
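
One caveat with the loop above: driver.refresh() rebuilds the page, so the elements stored in links before the refresh no longer point at live nodes, which is the usual cause of a stale element error. A minimal sketch of one way around this, assuming the page renders the same links in the same order after every refresh, is to look the list up again on each iteration and pick the element by index. The helper name for_each_fresh_link and the handle callback are hypothetical, not part of Selenium; "xpath" is the string value behind By.XPATH.

```python
def for_each_fresh_link(driver, xpath, handle):
    """Call handle(element) for each matching link, re-locating the list every time.

    Assumption: the page shows the same links in the same order after each reload.
    """
    # "xpath" is the string value of Selenium's By.XPATH locator strategy.
    total = len(driver.find_elements("xpath", xpath))
    results = []
    for index in range(total):
        # Look the elements up again: references found before a refresh are stale.
        fresh_links = driver.find_elements("xpath", xpath)
        if index >= len(fresh_links):
            break  # the page now shows fewer links than at the start
        results.append(handle(fresh_links[index]))
        driver.refresh()
    return results
```

The scraping step (opening the link, reading the details, returning to the list) would go inside handle, so that no element reference is kept across a refresh.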

Answer 1: (score: 0)

Try this:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

link = 'https://eurawheels.com/fr/catalogue/BBS'

with webdriver.Chrome() as driver:
    wait = WebDriverWait(driver,15)
    driver.get(link)

    linklist = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,".card-body > a")))
    for i,elem in enumerate(linklist):
        linklist[i].click()
        wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR,".spinner-border[role='status']")))
        time.sleep(2) #if you kick out this delay, your script will run very fast but you may end up getting same results multiple times.
        item = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR,"h3"))).text
        print(item)
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR,"h1.modal-title + button[class='close'][data-dismiss='modal']"))).click()
        driver.back()
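
An alternative that sidesteps stale references altogether is to read the link URLs into plain strings first and then load them one by one, since strings cannot go stale. This is only a sketch, and it assumes every link exposes an href that leads to a page of its own; the modal handling above suggests that may not hold for every link on this site. The names collect_hrefs, visit_each and scrape are hypothetical; "xpath" is the string value behind By.XPATH.

```python
def collect_hrefs(driver, xpath):
    """Return the href of every link matching xpath, as strings, without duplicates."""
    hrefs = []
    # "xpath" is the string value of Selenium's By.XPATH locator strategy.
    for element in driver.find_elements("xpath", xpath):
        href = element.get_attribute("href")
        if href and href not in hrefs:
            hrefs.append(href)
    return hrefs


def visit_each(driver, hrefs, scrape):
    """Load each URL in turn and collect whatever scrape(driver) returns."""
    results = []
    for href in hrefs:
        driver.get(href)  # navigating by URL needs no element reference
        results.append(scrape(driver))
    return results
```

With the driver from the question, collect_hrefs(driver, '//div[@class="col-xs-1 col-md-3"]//a') would be called once, and the resulting list passed to visit_each together with a function that reads the tyre details from the loaded page.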