I'm trying to print the NPR headlines from https://www.npr.org/sections/thetwo-way/archive, but my code doesn't work. I'm using Python 3 and Selenium with ChromeDriver. Here is what I have so far:
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
custom_path = "/Users/ashkij/Desktop/"
driver = webdriver.Chrome("/Users/ashkij/Desktop/chromedriver")
#Open a page that has a list of NPR headlines.
driver.get("https://www.npr.org/sections/thetwo-way/archive")
#After the first few articles, one has to scroll to the bottom of the page
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
for i in range(1,10):
    #Get each article from an XPATH expression
    article_headline = driver.find_element_by_xpath("""//*[@id="infinitescroll"]/article[{}]/div[2]/h2/a""".format(i))
    print(article_headline.text)
On the very first article, I get this error:
selenium.common.exceptions.NoSuchElementException: Message: no such element:
Unable to locate element: {"method":"xpath","selector":"//*[@id="infinitescroll"]/article[1]/div[2]/h2/a"}
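However, I have confirmed that the above is the XPath expression for the given article, so I don't know why Selenium says the XPath expression is invalid.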
Answer 0 (score: 0)
Try the following code:
headlines = driver.find_elements_by_css_selector(".title>a")
for headline in headlines:
    print(headline.text)
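If the headlines are rendered after the initial page load, the lookup can still come back empty, so an explicit wait may help. Below is a minimal sketch of that idea using the Selenium 4 locator API (`By`, `WebDriverWait`). The helper names `headline_texts` and `fetch_headlines`, the 10-second timeout, and the assumption that the `.title > a` selector from above still matches the page are all illustrative and not verified against the live site.

```python
def headline_texts(elements):
    """Return the stripped, non-empty text of each element-like object."""
    texts = (element.text.strip() for element in elements)
    return [text for text in texts if text]


def fetch_headlines(url, css_selector=".title > a", timeout=10):
    """Open url in Chrome and return the headline texts found on it.

    css_selector and timeout are assumptions; adjust them to the real page.
    """
    # Imported here so headline_texts() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumes Selenium 4.6+ so that a matching driver can be located
    # automatically; otherwise configure the driver path separately.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        try:
            # Wait until at least one matching link is present in the DOM.
            links = WebDriverWait(driver, timeout).until(
                EC.presence_of_all_elements_located((By.CSS_SELECTOR, css_selector))
            )
        except TimeoutException:
            # Nothing matched within the timeout: report no headlines.
            links = []
        return headline_texts(links)
    finally:
        # Always close the browser, even if the lookup fails.
        driver.quit()


# Example call (opens a real browser, so it is left commented out):
# for headline in fetch_headlines("https://www.npr.org/sections/thetwo-way/archive"):
#     print(headline)
```

Returning an empty list on timeout, rather than raising, makes it easy to tell "the selector matched nothing" apart from an actual crash.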
Hope this helps!