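I am trying to download the PDF slides from this website using Python and Selenium, but I think the links to the slides only appear after a script has loaded. I tried waiting for the JavaScript to load, but it still does not find anything. Any ideas?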
import os, sys, time, random
import requests
from selenium import webdriver
from bs4 import BeautifulSoup
url = 'https://mila.umontreal.ca/en/cours/deep-learning-summer-school-2017/slides'
browser = webdriver.Chrome()
browser.get(url)
browser.implicitly_wait(3)
html = browser.page_source
links = browser.find_elements_by_class_name('flip-entry')
print(links)
browser.quit()
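Answer 0 (score: 0)
The reason is that the links are not on the main page. The links you want are inside an iframe, which points to https://drive.google.com/embeddedfolderview?hl=fr&id=0ByUKRdiCDK7-c0k1TWlLM1U1RXc#list
You can load that URL directly in your code instead of the main page, or you can switch to the frame: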
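from selenium.webdriver.common.by import By

# Switch into the iframe first; elements inside it cannot be found from the top-level page.
browser.switch_to.frame(browser.find_element(By.CLASS_NAME, "iframe-class"))
links = browser.find_elements(By.CSS_SELECTOR, ".flip-entry a")
for link in links:
    print(link.get_attribute("href"))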
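For the first option (loading the embedded folder URL directly), a browser may not be needed at all. The following is only a sketch using the standard library, and it assumes that the embedded folder view returns the `flip-entry` elements in its static HTML without running JavaScript, which I have not verified for this folder. The names `FlipEntryLinkParser`, `extract_flip_entry_links` and `fetch_flip_entry_links` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class FlipEntryLinkParser(HTMLParser):
    """Collect href values of <a> tags nested inside class="flip-entry" elements."""

    # Tags without a closing tag; they must not change the nesting depth.
    VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # nesting depth inside the current flip-entry element
        self.links = []  # hrefs found so far

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            # Already inside a flip-entry element: track nesting and record links.
            if tag not in self.VOID_TAGS:
                self.depth += 1
            if tag == "a" and attrs.get("href"):
                self.links.append(attrs["href"])
        elif "flip-entry" in (attrs.get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in self.VOID_TAGS:
            self.depth -= 1


def extract_flip_entry_links(html):
    """Return the hrefs matched by the equivalent of the '.flip-entry a' selector."""
    parser = FlipEntryLinkParser()
    parser.feed(html)
    return parser.links


def fetch_flip_entry_links(url):
    """Download the page at url and extract the links (assumes static HTML)."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_flip_entry_links(html)
```

You would call `fetch_flip_entry_links(...)` with the embedded folder URL given above. If the result comes back empty, the entries are probably rendered by a script, and switching to the frame in Selenium is the safer route.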
Answer 1 (score: 0)
from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://mila.umontreal.ca/en/cours/deep-learning-summer-school-2017/slides'
browser = webdriver.Chrome()
browser.get(url)
# The slide links live inside an iframe, so switch into it before searching.
browser.switch_to.frame(browser.find_element(By.CLASS_NAME, 'iframe-class'))
# '.flip-entry a' is a CSS selector, so look it up by CSS selector, not by class name.
links = browser.find_elements(By.CSS_SELECTOR, '.flip-entry a')
for link in links:
    print(link.get_attribute("href"))
browser.quit()
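The printed hrefs are most likely Drive viewer pages, not the PDF files themselves. As a rough sketch of the next step, the helper below turns such a link into a direct-download URL. It assumes the hrefs look like `https://drive.google.com/file/d/<id>/view` or carry an `id=` query parameter, and it relies on the commonly used but unofficial `uc?export=download` URL form, so treat both as assumptions. The function names are hypothetical.

```python
import re
from urllib.parse import urlparse, parse_qs


def drive_file_id(href):
    """Return the Drive file ID found in href, or None if no known pattern matches."""
    # Pattern 1 (assumed): https://drive.google.com/file/d/<id>/view
    match = re.search(r"/file/d/([^/?#]+)", href)
    if match:
        return match.group(1)
    # Pattern 2 (assumed): https://drive.google.com/open?id=<id>
    ids = parse_qs(urlparse(href).query).get("id")
    return ids[0] if ids else None


def drive_download_url(href):
    """Build a direct-download URL for a Drive viewer link, or None if no ID is found."""
    file_id = drive_file_id(href)
    if file_id is None:
        return None
    return "https://drive.google.com/uc?export=download&id=" + file_id
```

Each resulting URL could then be fetched with `requests.get(...)` and written to a `.pdf` file. For large files Drive may answer with a confirmation page instead of the file, which this sketch does not handle.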