I am scraping lyrics from an API and saving them to a CSV file.
This is how I am doing it:
with codecs.open('lyrics.csv', 'ab', encoding='utf8') as outputfile:
    outwriter = csv.writer(outputfile)
    for url in urls:
        page_url = base_url + url
        page = requests.get(page_url, headers=headers)
        html = BeautifulSoup(page.text, "html.parser")
        lyrics = html.find('div', class_='lyrics').get_text()
        outwriter.writerow(lyrics)
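Now I am trying to figure out how to store each lyric in a single row, together with a binary value such as 0.
Each lyric is printed like this: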
[Hook: John Mayer]
Bittersweet
You're gonna be the death of me
I don't want you, but I need you
I love you and hate you at the very same time
Bittersweet
[Verse 1: Kanye West]
See, what I want so much should never hurt this bad
Never did this before, that's what the virgin says
We've been generally warned, that's what the surgeon says
God, talk to me now, this is an emergency
And she claims she only with me for the currency
You cut me deep, bitch, cut me like surgery
And I was too proud to admit that it was hurting me
I'd never do that to you, at least purposely
We breaking up again, we making up again
But we don't love no more, I guess we fucking then
Have you ever felt you ever want to kill her?
And you mix them emotions with tequila
And you mix that with a little bad advice
On one of them bad nights, y'all have a bad fight
And you talk about her family, her aunts and shit
And she say, "Motherfucker, your momma's a bitch"
You know, domestic drama and shit, all the attitude
I'd never hit a girl, but I'll shake the shit out of you
But I'mma be the bigger man, big pimpin' like Jigga man
Oh, I guess I figure it's
[Hook: John Mayer]
Bittersweet
You're gonna be the death of me
I don't want you, but I need you
I love you, hate you at the very same time
Bittersweet
[Verse 2: Kanye West]
See, what I want so much should never hurt this bad
Never did this before, that's what the virgin says
We've been generally warned, that's what the surgeon says
God, talk to me now, this is an emergency
And my niggas said I shouldn't let it worry me
I need to focus on the girls we getting currently
But I been thinking and it got me back to sinking
And this relationship, it even got me back to drinking
Now this Hennessy, uh, it's gonna be the death of me
And I always thought that you having my child was our destiny
But I can't even vibe with you sexually
'Cause every time that I'd try, you would question me
Saying, "You fucking them girls, disrespecting me
You don't see how your lies is affecting me?
You don't see how our life was supposed to be?
And I never let a nigga get that close to me
And you ain't cracked up to what you were supposed to be
You always gone, you always be where them hoes will be"
And this the first time she ever spilled her soul to me
[Hook: John Mayer & Kanye West]
Bittersweet (I fucked up and I know it, G)
You're gonna be the death of me (I guess it's bittersweet poetry)
I don't want you, but I need you
I love you and hate you at the very same time
Bittersweet
You're gonna be the death of me
I don't want you, but I need you
I love you and hate you at the very same time
[Verse 3: Kanye West]
See, what I want so much should never hurt this bad
Never did this before, that's what the virgin says
We've been generally warned, that's what the surgeon says
God, talk to me now, this is an emergency
See, what I want so much should never hurt this bad
Never did this before, that's what the virgin says
We've been generally warned, that's what the surgeon says
God, talk to me now, this is an emergency
[Hook: John Mayer]
Bittersweet
You're gonna be the death of me
I don't want you, but I need you
I love you and hate you at the very same time
(Bittersweet
You're gonna be the death of me
I don't want you
...very same time)
How do I set the delimiter so that the whole lyric goes in one row, followed by a comma and the value 0?
Answer 0 (score: 0)
It works with csv.DictWriter:
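import codecs
import csv

import requests
from bs4 import BeautifulSoup

# urls, base_url and headers are defined elsewhere, as in the question.
with codecs.open('classification/data/genius.csv', 'ab', encoding='utf8') as outputfile:
    outwriter = csv.DictWriter(outputfile, fieldnames=['lyrics', 'classification'])
    outwriter.writeheader()
    for url in urls:
        page_url = base_url + url
        page = requests.get(page_url, headers=headers)
        html = BeautifulSoup(page.text, "html.parser")
        lyrics = html.find('div', class_='lyrics').get_text()
        # One row per song: the full lyric text in one field, the label in the other.
        outwriter.writerow({'lyrics': lyrics, 'classification': 0})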
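
For anyone wondering why no custom delimiter is needed: as far as I can tell, the problem in the question is that writerow expects a sequence of fields, so a bare string is split into one field per character. Here is a minimal sketch that can be run without any network access. It assumes Python 3 and uses io.StringIO in place of the real file; sample_lyrics is a made-up stand-in for the scraped text, not real output from the API.

```python
import csv
import io

# Hypothetical stand-in for one scraped lyric (assumption, not real API output).
sample_lyrics = "[Hook: John Mayer]\nBittersweet\nYou're gonna be the death of me"

# A bare string is iterated character by character, one field each.
chars_buffer = io.StringIO(newline="")
csv.writer(chars_buffer).writerow(sample_lyrics[:6])
split_line = chars_buffer.getvalue()

# A dict (or list) row keeps the whole lyric in a single field.
# The csv module quotes the field because it contains line breaks.
rows_buffer = io.StringIO(newline="")
outwriter = csv.DictWriter(rows_buffer, fieldnames=["lyrics", "classification"])
outwriter.writeheader()
outwriter.writerow({"lyrics": sample_lyrics, "classification": 0})

# Reading the data back gives one record per song, line breaks intact.
rows_buffer.seek(0)
rows = list(csv.DictReader(rows_buffer))
print(split_line)
print(rows[0]["classification"], repr(rows[0]["lyrics"]))
```

The lyric still spans several physical lines in the file, but it sits inside one quoted field, so csv readers treat it as a single row. Note that the value 0 comes back as the string "0" when read, since CSV does not store types.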