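I am trying to access the text of elements using Selenium and Python. I can access the elements themselves just fine, but it fails when I try to get their text.
Here is my code: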
from selenium import webdriver
driver = webdriver.Chrome() # I removed the path for my post, but there is one that works in my actual code
URL = "https://www.costco.com/laptops.html"
driver.get(URL)
prices = driver.find_elements_by_class_name("price")
print([price.text for price in prices])
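When I run this code, I get: selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
However, if I print the elements themselves, there is no problem. I have read some older posts about the stale element exception, but I don't understand why it applies in this case. Why would the DOM change when I try to access the text?
Answer 0 (score: -1)
It turns out you just need to wait: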
from selenium import webdriver
import time
driver = webdriver.Chrome() # I removed the path for my post, but there is one that works in my actual code
URL = "https://www.costco.com/laptops.html"
driver.get(URL)
time.sleep(3)
prices = driver.find_elements_by_class_name("price")
print([price.text for price in prices])
Output:
['$1,999.99', '$2,299.99', '', '', '$769.99', '', '$799.99', '$1,449.99', '$1,199.99', '$1,199.99', '$1,999.99', '$1,599.99', '$1,299.99', '$2,299.99', '$1,549.99', '$1,499.99', '$599.99', '$1,699.99', '$1,079.99', '$2,999.99', '$1,649.99', '$1,499.99', '$2,399.99', '$1,499.97', '$1,199.99', '$1,649.99', '$849.99', '']
The proper way to do this is to use WebDriverWait rather than a fixed sleep.
Old answer:
I'm not sure why this happens, but I suggest you try BeautifulSoup:
from selenium import webdriver
from bs4 import BeautifulSoup
driver = webdriver.Chrome() # I removed the path for my post, but there is one that works in my actual code
URL = "https://www.costco.com/laptops.html"
driver.get(URL)
soup = BeautifulSoup(driver.page_source, "html.parser")  # name the parser explicitly to avoid the "no parser specified" warning
divs = soup.find_all("div",{"class":"price"})
[div.text.replace("\t",'').replace("\n",'') for div in divs]
Output:
['$1,099.99',
'$399.99',
'$1,199.99',
'$599.99',
'$1,049.99',
'$799.99',
'$699.99',
'$949.99',
'$699.99',
'$1,999.99',
'$449.99',
'$2,699.99',
'$1,149.99',
'$1,599.99',
'$1,049.99',
'$1,249.99',
'$299.99',
'$1,799.99',
'$749.99',
'$849.99',
'$2,299.99',
'$999.99',
'$649.99',
'$799.99']