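Why can't my Python code find the <video> tag for this particular URL? In Chrome DevTools you can see that the tag exists. I have tried several different waits without any success.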
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome('E:/Work/IdeaProjects/web_loaders/movie_scraper/chromedriver.exe')
driver.implicitly_wait(3)
url = 'https://fmovies.is/film/gilmore-girls-4.krm6'
print('opening %s' % url)
driver.get(url)
content = driver.find_element_by_id('player').find_element_by_class_name('cover')
content.click()
print('after click')
src = WebDriverWait(driver, 12).until(
ec.presence_of_element_located((By.TAG_NAME, 'video'))
).get_attribute('src')
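@DebanjanB, I left out the code that searches for the "Gilmore Girls" video because it is not relevant to this question. As for your question: please open the site https://fmovies.is, type "Gilmore Girls" into the search bar and press Enter, then click the first item found. The browser opens the URL https://fmovies.is/film/gilmore-girls-6.mwo7. On this page you can see an empty "player" with a play button. Note that the link open at this point is not the streaming link. Click the play icon. The browser then opens a new link and starts streaming the video. That last link is the one I want to extract. If you press the "Select element" button in Chrome and select the streaming player, you will see the <video> tag in the "Elements" tab. The line you asked about simply tries to find this tag, using Selenium's explicit wait.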
Answer 0 (score: 0)
This code should help you get the <video> tag for the URL used in the example below:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome()
driver.maximize_window()
url = 'https://gomovies.to/film/gilmore-girls-a-year-in-the-life-season-1-18045/'
print('opening %s' % url)
driver.get(url)
content = driver.find_element(By.CLASS_NAME, 'mvi-cover')
content.click()
print('after click')
src = WebDriverWait(driver, 10).until(
ec.presence_of_element_located((By.XPATH, '//*[@id="media-player"]//video'))
)
print(src.get_attribute('src'))
The output will look like this:
C:\Python27\python.exe C:/Users/osya.py
opening https://gomovies.to/film/gilmore-girls-a-year-in-the-life-season-1-18045/
after click
http://c5s1.vsharing.ru/movies09/Series/2016/11/27/Gilmore.Girls.2016.S01E01.720p.WEBRip.x264-TheRival.mp4?h=sd8ZYN1c0xLL5D8qmxARSg&e=1498145815
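The wait above only checks that a <video> element is present. If the player script fills in the `src` attribute some time after creating the element (an assumption; neither page was verified to behave this way), `get_attribute('src')` can still come back empty. A minimal sketch of a custom wait condition that returns only once a non-empty `src` exists might look like this. The names `video_src_loaded` and `wait_for_video_src` are hypothetical helpers, not part of Selenium.

```python
def video_src_loaded(driver):
    """Custom wait condition: return the first non-empty <video> src, else False.

    "tag name" is the string value of By.TAG_NAME, so this call is equivalent
    to driver.find_elements(By.TAG_NAME, "video").
    """
    for video in driver.find_elements("tag name", "video"):
        src = video.get_attribute("src")
        if src:
            return src
    # A falsy return value makes WebDriverWait poll again until the timeout.
    return False


def wait_for_video_src(driver, timeout=10):
    """Block until some <video> element has a src, or raise TimeoutException."""
    # Imported here so the condition above has no import-time dependency.
    from selenium.webdriver.support.ui import WebDriverWait

    # WebDriverWait.until accepts any callable that takes the driver.
    return WebDriverWait(driver, timeout).until(video_src_loaded)
```

With this in place, the last lines of the script could become `print(wait_for_video_src(driver))`. If the <video> element turns out to live inside an iframe or a newly opened window, the driver would first have to switch to that context, which this sketch does not attempt.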