I've been trying to scrape a YouTube link from a web page, but nothing has worked. This is a picture of what I've been trying to scrape.
This is the code I tried most recently:
youtube_link = soup.find("a", class_="ytp-title-link yt-uix-sessionlink")
This is the page that contains the YouTube link: https://www.electronic-festivals.com/event/i-am-hardstyle-germany
I really need this to work. Thanks in advance.
Answer 0 (score: 0)
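Most YouTube links sit inside an iframe and only appear after JavaScript has run, so a parser that sees only the page's initial HTML will likely miss them. Try using Selenium. The following extracts any href or src that contains youtube. I only switch into the key iframe that hosts the YouTube clip; you could loop over all iframes to check each one.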
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Matches any element whose href or src contains "youtube".
SELECTOR = "[href*=youtube], [src*=youtube]"

def addItems(links, final):
    # Prefer src (iframes, embeds); fall back to href (anchors).
    for link in links:
        ref = link.get_attribute('src') or link.get_attribute('href')
        final.append(ref)
    return final

url = "https://www.electronic-festivals.com/event/i-am-hardstyle-germany"
driver = webdriver.Chrome()
driver.get(url)
# Switch into the iframe that hosts the YouTube player (Selenium 4 locator API).
driver.switch_to.frame(driver.find_element(By.CSS_SELECTOR, '.media-youtube-player'))
final = []

try:
    # Wait up to 10 seconds for matching elements inside the iframe.
    links = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, SELECTOR)))
    addItems(links, final)
except TimeoutException:
    pass  # nothing matched inside the iframe within the timeout
finally:
    # Always return to the top-level document.
    driver.switch_to.default_content()

# Also collect matches from the top-level document.
links = driver.find_elements(By.CSS_SELECTOR, SELECTOR)
addItems(links, final)

# De-duplicate before printing.
for link in set(final):
    print(link)

driver.quit()
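The answer mentions that you could loop over all iframes instead of switching into just one. A minimal sketch of that idea follows, assuming a Selenium 4 style driver. The names `collect_youtube_links`, `attr_links`, and `demo` are hypothetical helpers, not part of Selenium, and the sketch does no explicit waiting, so a slow page may need a `WebDriverWait` before calling it. It only visits top-level iframes, not nested ones.

```python
CSS = "css selector"  # same string as selenium's By.CSS_SELECTOR
YOUTUBE_SELECTOR = "[href*=youtube], [src*=youtube]"


def attr_links(elements):
    """Return the src (or, failing that, the href) of each element."""
    found = []
    for el in elements:
        ref = el.get_attribute("src") or el.get_attribute("href")
        if ref:
            found.append(ref)
    return found


def collect_youtube_links(driver):
    """Collect matching links from the top-level page and each top-level iframe."""
    links = set(attr_links(driver.find_elements(CSS, YOUTUBE_SELECTOR)))
    frame_count = len(driver.find_elements(CSS, "iframe"))
    for index in range(frame_count):
        try:
            # Re-locate the frames on every pass to avoid stale references.
            frame = driver.find_elements(CSS, "iframe")[index]
            driver.switch_to.frame(frame)
            links.update(attr_links(driver.find_elements(CSS, YOUTUBE_SELECTOR)))
        except Exception:
            # A frame may disappear or refuse the switch; skip it.
            pass
        finally:
            # Return to the top-level document before the next frame.
            driver.switch_to.default_content()
    return links


def demo(url):
    """Hypothetical entry point; needs selenium and a Chrome driver installed."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        for link in sorted(collect_youtube_links(driver)):
            print(link)
    finally:
        driver.quit()
```

Calling `demo("https://www.electronic-festivals.com/event/i-am-hardstyle-germany")` would print whatever matches are found; the result depends on how the page is built at the time you run it.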
Answer 1 (score: 0)
If by scraping you mean downloading, try running
pip install youtube-dl
in your shell.
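Once the package is installed, it can also be driven from Python rather than from the command line. The sketch below is one way to do that, assuming the `youtube_dl` module is importable. `build_options` and `download` are hypothetical helper names, and the chosen options (`format`, `outtmpl`, `noplaylist`) are just one reasonable configuration, not something the answer prescribes.

```python
import os


def build_options(output_dir="."):
    """Return a small option dict for youtube_dl.YoutubeDL."""
    return {
        # Let youtube-dl pick the best single-file format.
        "format": "best",
        # Name the file after the video title inside output_dir.
        "outtmpl": os.path.join(output_dir, "%(title)s.%(ext)s"),
        # Download only the video itself, even if the URL is part of a playlist.
        "noplaylist": True,
    }


def download(url, output_dir="."):
    """Download one video; url is whatever link you extracted from the page."""
    # Imported lazily so this module loads even without youtube-dl installed.
    import youtube_dl

    with youtube_dl.YoutubeDL(build_options(output_dir)) as ydl:
        ydl.download([url])
```

You would pass `download` a link obtained from the page, for example one of the URLs printed by the Selenium script in the first answer.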