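I can't parse the HTML of a YouTube playlist. For example, when I inspect the tags at "https://www.youtube.com/playlist?list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_", I see the class name "yt-simple-endpoint.style-scope.ytd-playlist-video-renderer", but selecting elements by that class with bs4 does not work. However, I found another piece of working code online that selects the class "pl-video-title-link". I can't find that class anywhere on the web page, and none of the tags have it. The working code is attached. Any help would be appreciated.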
from bs4 import BeautifulSoup as bs
import requests
# Download the HTML as served, before any JavaScript has run
r = requests.get('https://www.youtube.com/playlist?list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_')
page = r.text
soup = bs(page, 'html.parser')
# Select every <a> tag that has the class pl-video-title-link
res = soup.find_all('a', {'class': 'pl-video-title-link'})
for l in res:
    print(l.get("href"))
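Answer 0 (score: 1)
This page uses JavaScript to change its structure, so the markup you see in the browser inspector is not the markup that requests downloads. You can print the soup right after downloading and see where the video links are initially. In this case, they are inside <tr> tags with the class pl-video: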
from bs4 import BeautifulSoup
import requests
url = 'https://www.youtube.com/playlist?list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
# Each playlist entry is a <tr class="pl-video"> row in the downloaded HTML
for i, tr in enumerate(soup.select('tr.pl-video')):
    print('{}. {}'.format(i + 1, tr['data-title']))
    # tr.a is the first <a> tag inside the row; its href is a relative URL
    print('https://www.youtube.com' + tr.a['href'])
    print('-' * 80)
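The same idea can be tried offline, without depending on what YouTube serves today. The following is only a minimal sketch: it uses the standard library's html.parser module instead of bs4 so that it runs without extra packages, and SAMPLE_HTML, PlaylistRowParser, and extract_videos are hypothetical names introduced for illustration. The sample markup is an assumption about the shape of the served HTML (rows with the class pl-video and a data-title attribute), not a copy of the real page.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the HTML that requests would download.
# The real page's markup may differ and can change at any time.
SAMPLE_HTML = """
<table>
  <tr class="pl-video other-class" data-title="First video">
    <td><a class="pl-video-title-link" href="/watch?v=aaa">First video</a></td>
  </tr>
  <tr class="pl-video other-class" data-title="Second video">
    <td><a class="pl-video-title-link" href="/watch?v=bbb">Second video</a></td>
  </tr>
</table>
"""


class PlaylistRowParser(HTMLParser):
    """Collect (title, href) pairs from <tr> rows that have the class pl-video."""

    def __init__(self):
        super().__init__()
        self.videos = []      # collected (title, href) tuples
        self._in_row = False  # True while inside a matching row with no link taken yet
        self._title = None    # data-title of the current row

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "tr" and "pl-video" in classes:
            self._in_row = True
            self._title = attrs.get("data-title")
        elif tag == "a" and self._in_row and attrs.get("href"):
            # Keep only the first link with an href in each row.
            self.videos.append((self._title, attrs["href"]))
            self._in_row = False

    def handle_endtag(self, tag):
        if tag == "tr":
            self._in_row = False


def extract_videos(html):
    """Return a list of (title, relative_url) tuples found in the given HTML."""
    parser = PlaylistRowParser()
    parser.feed(html)
    parser.close()
    return parser.videos


if __name__ == "__main__":
    for i, (title, href) in enumerate(extract_videos(SAMPLE_HTML)):
        print('{}. {}'.format(i + 1, title))
        print('https://www.youtube.com' + href)
```

Feeding this parser markup that only contains the browser-side class names (for example yt-simple-endpoint) yields an empty list, which mirrors what the question observed: a selector has to match the downloaded HTML, not the DOM shown in the inspector.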
Answer 1 (score: 0)
Try this:
Answer 2 (score: -1)
Answer 3 (score: -2)