Selenium find_elements By.XPATH: error when trying to extract href URLs

Time: 2020-09-17 13:20:15

Tags: python selenium selenium-webdriver selenium-firefoxdriver

I am trying to use the Firefox WebDriver to extract all the URLs from a href attributes that contain the word "products". I am using the latest Selenium binary. I tried this:

driver = webdriver.Firefox()
driver.get(url)
nodes = driver.find_elements(By.XPATH, "//a[contains(@href,'products')]/@href")
print("nodes: ", nodes)
links = []
for elem in nodes:
    links.append(elem)

But I get a type error:

selenium.common.exceptions.WebDriverException: Message: TypeError: Expected an element or WindowProxy, got: [object Attr href="https://www.example.com/catalogue/products/a.html"]

I also tried driver.find_elements(By.XPATH, "//a[contains(@href,'products')]") and then getAttribute("href") on each result, but that did not give good results either.

I am not sure where the mistake is or how to solve it.

Excerpt from the HTML: [not preserved]

1 answer:

Answer 0 (score: 0)

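The TypeError occurs because the XPath ends in /@href, which selects attribute nodes, while find_elements() can only return elements. To extract the href attributes with Selenium, locate the a elements themselves and call get_attribute("href") on each one. Induce WebDriverWait for visibility_of_all_elements_located() so that the links have rendered before they are read; you can use either of the following locator strategies: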

  • Using CSS_SELECTOR:

    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a[href*='products']")))])
    
  • Using XPATH:

    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//a[contains(@href,'products')]")))])
    
  • Note: you have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
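
Putting the pieces above together, a minimal end-to-end sketch might look like the following. The function names hrefs_from_elements and extract_product_links are hypothetical, chosen only for illustration, and the de-duplication step is an addition that the original snippets do not perform. The Selenium imports sit inside the second function so that the filtering helper can be exercised without a browser.

```python
def hrefs_from_elements(elements, needle="products"):
    """Return the href values containing `needle`, de-duplicated, in page order.

    `elements` is any iterable of objects exposing get_attribute(name),
    such as the WebElement list returned by find_elements().
    """
    seen = set()
    links = []
    for elem in elements:
        href = elem.get_attribute("href")
        # Skip anchors without an href and URLs that do not match.
        if not href or needle not in href:
            continue
        if href not in seen:
            seen.add(href)
            links.append(href)
    return links


def extract_product_links(url, timeout=20):
    """Open `url` in Firefox and return the matching link URLs (hypothetical helper)."""
    # Imported here so hrefs_from_elements() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Select the <a> elements, not their @href attribute nodes.
        elements = WebDriverWait(driver, timeout).until(
            EC.visibility_of_all_elements_located(
                (By.XPATH, "//a[contains(@href,'products')]")
            )
        )
        return hrefs_from_elements(elements)
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling extract_product_links(url) with the page address from the question should return a plain list of strings. If no matching link becomes visible within the timeout, WebDriverWait raises a TimeoutException, which the caller may want to catch.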