I am trying to get the current URL of each item while I am already inside a loop:
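from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def get_financial_info(self):
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--window-size=1920x1080")
    # Pass the options to the driver; the session info below reports headless Chrome.
    driver = webdriver.Chrome(executable_path='/path/chromedriver', chrome_options=chrome_options)
    driver.get("https://www.financialjuice.com")
    try:
        # Wait until the trending block is visible before reading the navigation links.
        WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='trendWrap']")))
    except TimeoutException:
        driver.quit()
        return
    # Category URLs are stored as plain strings.
    category_url = [a.get_attribute("href") for a in
                    driver.find_elements_by_xpath("//ul[@class='nav navbar-nav']/li[@class='text-uppercase']/a[@href]")]
    for record in category_url:
        driver.get(record)
        item = {}
        # Headline links are kept as WebElements that belong to the category page.
        url_element = driver.find_elements_by_xpath("//p[@class='headline-title']/a[@href]")
        for links in url_element:
            driver.get(links.get_attribute("href"))
            print(driver.current_url)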
However, I only get the first actual link, and then the code stops with:
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
(Session info: headless chrome=62.0.3202.94)
(Driver info: chromedriver=2.33.506092 (733a02544d189eeb751fe0d7ddca79a0ee28cce4),platform=Linux 4.4.0-101-generic x86_64)
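I tried to investigate what was happening, and I realized that the webdriver opens the first category, selects the first item, gets its actual link, and then stops. What I expected is that it would go back to the previous URL, take the second item, get the next link, and so on until the loop ends.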
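
One possible way to get that behavior, sketched under assumptions: the exception is consistent with the `WebElement` references in `url_element` going stale once `driver.get(...)` leaves the category page. Reading every `href` into a plain string before navigating avoids holding such references. The helper names `collect_hrefs` and `resolve_headline_urls` are hypothetical and not part of Selenium; the only driver calls used are `get`, `find_elements`, and `current_url`.

```python
def collect_hrefs(driver, xpath):
    """Return the href of every element matching xpath as plain strings."""
    # "xpath" is the string value of Selenium's By.XPATH locator strategy.
    elements = driver.find_elements("xpath", xpath)
    return [element.get_attribute("href") for element in elements]


def resolve_headline_urls(driver, category_urls, headline_xpath):
    """Visit each headline of each category and return the resulting URLs."""
    resolved = []
    for category in category_urls:
        driver.get(category)
        # Read all hrefs while the category page is still loaded. Strings
        # remain usable after navigation; WebElement references do not.
        for href in collect_hrefs(driver, headline_xpath):
            driver.get(href)
            resolved.append(driver.current_url)
    return resolved
```

Inside `get_financial_info`, the nested loop could then be replaced by something like `resolve_headline_urls(driver, category_url, "//p[@class='headline-title']/a[@href]")`, printing each returned URL. There is no need to navigate back to the category page, because the links were already copied out of it.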