Output is not displayed in the terminal

Time: 2019-07-06 07:20:11

Tags: selenium web selenium-chromedriver screen-scraping

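I am trying to scrape Medium posts and their content. Everything seems fine: the code runs and opens the browser at the specified URL. The output should then show the post titles, content, author names and the other printed values, but nothing appears.

All of the class names are correct as well. I then thought the cause might be that the dynamic content never ends, so I limited each output variable to 10 items, but still no output is shown.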

from selenium import webdriver 
from selenium.webdriver.common.by import By 
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC 
from selenium.common.exceptions import TimeoutException

option = webdriver.ChromeOptions()

browser = webdriver.Chrome(
    executable_path=r"C:/Users/Jai Sipani/Downloads/chrome_driver/chromedriver.exe",
    chrome_options=option)

browser.get("https://medium.com/topic/startups")


# Wait up to 60 seconds for the page to load
timeout = 60
try:
    WebDriverWait(browser, timeout).until(
        EC.visibility_of_element_located(
            (By.XPATH, "//img[@class='n dx dy dz ea ed y']")))
except TimeoutException:
    print("Timed out waiting for page to load")
    browser.quit()

# find_elements_by_xpath returns an array of selenium objects.

titles_heading = browser.find_elements_by_class_name(
    "ar aj da bc db bd em gb gc at aw eo dg dh av")

titles_heading = titles_heading[:10]
titles = [x.text for x in titles_heading]
print('titles:')
print(titles, '\n')


titles_desc = browser.find_element_by_class_name(
    "bh bi bc b bd be bf bg at aw dj dg dh av ef ep")

titles_desc = titles_desc[:10]
desc = [i.text for i in titles_desc]
print('desc:')
print(desc, '\n')


authors = browser.find_element_by_class_name(
    "bc b bd be bf bg at aw dj dg dh av ar aj")


authors = authors[:10]
author = [x.text for x in authors]
print('author: ')
print(author, '\n')

timeline = browser.find_element_by_class_name("fg ae fh")
timeline = timeline[:10]
time = [x.text for x in timeline]
print('time: ')
print(time, '\n')



for title, desc, author, time in zip(titles, titles_desc, authors, timeline):
    print("Title : title_Desc : authors : timeline")
    print(title + ": " + desc + ": "+ author + ": " + time, '\n')
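
One thing that may be worth checking in the loop above: it zips `titles` (a list of strings) with `titles_desc`, `authors` and `timeline`, which hold Selenium elements rather than text (and `find_element_by_class_name`, singular, returns one element rather than a list). The loop variables also reuse the names `desc`, `author` and `time`, which overwrites the lists built earlier. Below is a minimal sketch of the same loop over lists of plain strings. The sample values and the `format_rows` helper are hypothetical and only stand in for the scraped text.

```python
def format_rows(titles, descs, authors, times):
    """Combine parallel lists of strings into one display line per post."""
    rows = []
    # zip stops at the shortest list, so extra items in longer lists are ignored
    for title, desc, author, when in zip(titles, descs, authors, times):
        rows.append(f"{title}: {desc}: {author}: {when}")
    return rows


# Hypothetical sample data standing in for the scraped text
titles = ["Post A", "Post B"]
descs = ["About A", "About B"]
authors = ["Ann", "Bob"]
times = ["Jul 5", "Jul 6"]

print("Title : Description : Author : Time")
for row in format_rows(titles, descs, authors, times):
    print(row)
```

Using distinct names for the loop variables (`when` instead of `time`, for example) keeps the original lists intact for later use.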

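I expected the output to be a list of the articles and their content, but I get nothing. The script does close the session cleanly after 60 seconds.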
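
A possible cause, offered as a guess rather than a confirmed diagnosis: strings such as "fg ae fh" are several class names separated by spaces, while a class-name locator is meant to take a single class name. One common workaround is to turn the string into a CSS selector and pass it to `find_elements_by_css_selector`. The helper below is only a sketch: `compound_class_to_css` is a hypothetical name, and the generated class names on the page may change over time, so the selector itself is an assumption.

```python
def compound_class_to_css(class_attr):
    """Turn a space-separated class attribute into a CSS selector.

    "fg ae fh" becomes ".fg.ae.fh", which matches elements that carry
    all three classes.
    """
    names = class_attr.split()
    if not names:
        raise ValueError("expected at least one class name")
    return "." + ".".join(names)


selector = compound_class_to_css("fg ae fh")
print(selector)

# With a live driver this would be used roughly as follows (not run here):
# elements = browser.find_elements_by_css_selector(selector)
# texts = [e.text for e in elements[:10]]
```

The session closing after 60 seconds also suggests that the wait may be timing out. In that case `browser.quit()` runs in the `except` branch and the later lookups would act on a closed session, so it may help to stop the script at that point.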

0 answers:

No answers yet