Question

I'm trying to print the NPR headlines from https://www.npr.org/sections/thetwo-way/archive, but my code doesn't work. I'm using Python 3 and Selenium with ChromeDriver. This is what I have so far:

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains


custom_path = "/Users/ashkij/Desktop/"
driver = webdriver.Chrome("/Users/ashkij/Desktop/chromedriver")


#Open a page that has a list of NPR headlines.
driver.get("https://www.npr.org/sections/thetwo-way/archive")
#After the first few articles, one has to scroll to the bottom of the page 

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

for i in range(1,10):
    #Get each article from an XPATH expression
    article_headline = driver.find_element_by_xpath("""//*[@id="infinitescroll"]/article[{}]/div[2]/h2/a""".format(i))
    print(article_headline.text)

On the first article, I get this error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="infinitescroll"]/article[1]/div[2]/h2/a"}

However, I have verified that the above is the XPath expression for the given article, so I don't understand why Selenium cannot locate the element.

Answer 1

Try the following code:

# Select every headline link: an <a> that is a direct child of an element with class "title".
headlines = driver.find_elements_by_css_selector(".title>a")

for headline in headlines:
    print(headline.text)

Hope this helps!
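
The thread does not confirm what caused the NoSuchElementException, so the following is only a sketch of how the same selector might be used on a newer setup. It assumes Selenium 4 is installed (where the find_element_by_* helpers were removed in favour of find_elements(By..., ...)) and that the headline links may not be present in the DOM the moment driver.get returns, which is why it waits for them explicitly. The names headline_texts and print_npr_headlines, the 10-second timeout, and starting Chrome without an explicit driver path are illustrative assumptions, not part of the original answer.

```python
def headline_texts(elements):
    """Return the stripped, non-empty text of each element-like object."""
    texts = []
    for element in elements:
        text = element.text.strip()
        if text:  # skip links that render no visible text
            texts.append(text)
    return texts


def print_npr_headlines(
    url="https://www.npr.org/sections/thetwo-way/archive",
    selector=".title>a",  # selector taken from the answer above
    timeout=10,           # assumed value; tune for your connection
):
    """Open the archive page, wait for headline links, and print their text."""
    # Imported here so headline_texts can be reused without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: a recent Selenium 4 release that can find ChromeDriver itself.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until at least one matching link exists, then get all of them.
        links = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, selector))
        )
        for text in headline_texts(links):
            print(text)
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling print_npr_headlines() should open Chrome, wait up to the timeout for the links, and print one headline per line. If the wait times out, the selector probably no longer matches the page's markup, which is worth checking in the browser's developer tools.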

Getting NPR headlines with Selenium
