I am trying to use BeautifulSoup to scrape a song's lyrics from Genius, but when I try to print the lyrics I get no output. Here is my code:
import requests
from bs4 import BeautifulSoup
songURL = requests.get("https://genius.com/Marshmello-and-bastille-happier-lyrics")
song = songURL.content
soup = BeautifulSoup(song, 'lxml')
lyrics = soup.find_all("section")
for lyr in lyrics:
    for lyr1 in lyrics.select("p"):
        print(lyr1.text)
Can anyone tell me why this doesn't work? I have been trying for a while.
Answer 0 (score: 1)
The server appears to return two versions of the page: in one, the lyrics tag has class="song_body-lyrics", and in the other it has class="Lyrics__Container...".
This script tries to handle both cases:
import requests
from bs4 import BeautifulSoup
url = 'https://genius.com/Marshmello-and-bastille-happier-lyrics'
soup = BeautifulSoup(requests.get(url).content, 'lxml')
for tag in soup.select('div[class^="Lyrics__Container"], .song_body-lyrics p'):
    t = tag.get_text(strip=True, separator='\n')
    if t:
        print(t)
Prints:
[Intro]
Lately, I've been, I've been thinking
I want you to be happier, I want you to be happier
[Verse 1]
...and so on.
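To see why a single selector can cover both layouts, here is a minimal offline sketch. The two HTML strings are made-up stand-ins for the two page versions (they are not copies of the real Genius markup), the class suffix "-abc123" and the helper name extract_lyrics are hypothetical, and html.parser is used instead of lxml only so that the snippet needs no extra parser installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for the two page versions; not real Genius markup.
OLD_LAYOUT = (
    '<div class="song_body-lyrics"><div class="lyrics">'
    '<p>[Intro]<br>Line one</p></div></div>'
)
NEW_LAYOUT = '<div class="Lyrics__Container-abc123">[Intro]<br>Line one</div>'

# Same selector as in the script above: either a div whose class attribute
# starts with "Lyrics__Container", or a <p> inside .song_body-lyrics.
SELECTOR = 'div[class^="Lyrics__Container"], .song_body-lyrics p'


def extract_lyrics(html):
    """Return the non-empty text blocks matched by the combined selector."""
    soup = BeautifulSoup(html, 'html.parser')
    blocks = []
    for tag in soup.select(SELECTOR):
        text = tag.get_text(strip=True, separator='\n')
        if text:
            blocks.append(text)
    return blocks


print(extract_lyrics(OLD_LAYOUT))
print(extract_lyrics(NEW_LAYOUT))
```

Both calls should yield the same text, which suggests the selector does not care which version the server happens to return.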
Answer 1 (score: 0)
import requests
from bs4 import BeautifulSoup
songURL = requests.get("https://genius.com/Marshmello-and-bastille-happier-lyrics")
song = songURL.content
soup = BeautifulSoup(song, 'lxml')
final_lyrics = []
lyrics = soup.find('div', {'class': "lyrics"})
lyrics = lyrics.find_all('p')
for i in lyrics:
    final_lyrics.append(i.text)
    print(i)
Answer 2 (score: 0)
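You should get all of the text inside one specific div. You can find that div with your browser's devtools or with view-source. Here the div is <div class='lyrics'>. What distinguishes it is its class, lyrics, so we locate that div in the HTML and then print all of the text inside it.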
import bs4 as bs
import urllib.request
source = urllib.request.urlopen('https://alirezaarabi.com/view-source_https___genius.com_Alessia-cara-ready-lyrics.html').read()
soup = bs.BeautifulSoup(source,'lxml')
print(soup.title.string)
for div in soup.find_all('div', class_='lyrics'):
    print(div.text)
Answer 3 (score: -1)
If you look at the actual HTML source, there is no section tag. The structure actually looks like this:
<div class="song_body column_layout" initial-content-for="song_body">
<div class="column_layout-column_span column_layout-column_span--primary">
<div class="song_body-lyrics">
<h2 class="text_label text_label--gray text_label--x_small_text_size u-top_margin">Happier Lyrics</h2>
<div initial-content-for="lyrics">
<div class="lyrics">
<!--sse-->
<p>[Intro]<br>
Lately, I've been, I've been thinking<br>
I want you to be happier, I want you to be happier<br>
<br>
...
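As a rough check of that point, the sketch below parses an abridged copy of the fragment above. The closing tags are added by hand as an assumption about how the elided markup ends, and html.parser is used instead of lxml only to avoid an extra dependency. It shows that searching for section finds nothing, while looking up the div by its lyrics class does reach the text.

```python
from bs4 import BeautifulSoup

# Abridged copy of the fragment above; closing tags were added by hand
# and are an assumption about the elided part of the page.
HTML = """
<div class="song_body-lyrics">
  <div initial-content-for="lyrics">
    <div class="lyrics">
      <p>[Intro]<br>
      Lately, I've been, I've been thinking<br>
      I want you to be happier, I want you to be happier</p>
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(HTML, 'html.parser')

# There is no <section> tag, so the loop in the question never runs.
sections = soup.find_all('section')
print(len(sections))

# Look the container up by class instead, guarding against a missing div.
lyrics_div = soup.find('div', class_='lyrics')
lines = list(lyrics_div.stripped_strings) if lyrics_div else []
for line in lines:
    print(line)
```

The None guard matters in practice because, as noted in the first answer, the server may return a version of the page that has no div with this class.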