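How do I get Billboard Hot 100 data in CSV format?

Time: 2019-01-05 16:52:54

Tags: python python-3.x

I'm new to Python and am trying to scrape Billboard data with Python 3. I'm using the Python Billboard API (billboard.py) and want to get the Hot 100 tracks in CSV format, with Billboard Number, Artist Name, Song Title, Last Week Number, Peak Position, and Weeks on Chart as the headers. I've been looking at this for hours with no luck, so any help would be appreciated!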

import requests, csv
from bs4 import BeautifulSoup

url = 'http://www.billboard.com/charts/hot-100'

with open('Billboard_Hot100.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Billboard Number','Artist Name','Song Title','Last Week Number','peak_position','weeks_on_chart'])

res = requests.get(url)
soup = BeautifulSoup(res.text, "html.parser")

for container in soup.find_all("article",class_="chart-row"):

    billboard_number = container.find(class_="chart-row__current-week").text

    artist_name_a_tag = container.find(class_="chart-row__artist").text.strip()

    song_title = container.find(class_="chart-row__song").text

    last_week_number_tag = container.find(class_="chart-row__value")
    last_week_number = last_week_number_tag.text

    peak_position_tag = last_week_number_tag.find_parent().find_next_sibling().find(class_="chart-row__value")
    peak_position = peak_position_tag.text

    weeks_on_chart_tag = peak_position_tag.find_parent().find_next_sibling().find(class_="chart-row__value").text

    print(billboard_number,artist_name_a_tag,song_title,last_week_number,peak_position,weeks_on_chart_tag)
    writer.writerow([billboard_number,artist_name_a_tag,song_title,last_week_number,peak_position,weeks_on_chart_tag])

The CSV file that comes back has the headers, but the columns contain no data. Am I missing something?

1 Answer:

Answer 0: (score: 0)

Can you work with this?

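from lxml import html
import requests

page = requests.get('https://www.billboard.com/charts/hot-100')
tree = html.fromstring(page.content)
# xpath() returns a list of text nodes, so clean each string individually;
# calling .replace() on the list itself would raise AttributeError
titles = tree.xpath('//span[@class="chart-list-item__title-text"]/text()')
titles = [title.replace('\n', '') for title in titles]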
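
A separate thing to check in the question's code: the `for` loop is not indented under the `with` block, so the file is already closed by the time `writer.writerow` is called for the data rows. If the loop matches nothing (possibly because the page no longer uses the `chart-row` classes, which is a guess on my part), only the header ever gets written. Below is a minimal sketch of keeping every write inside the `with` block. `write_chart_csv` is a hypothetical helper name, and `rows` is assumed to be a list of six-item lists built by the scraping loop.

```python
import csv

HEADER = ["Billboard Number", "Artist Name", "Song Title",
          "Last Week Number", "peak_position", "weeks_on_chart"]


def write_chart_csv(rows, path="Billboard_Hot100.csv"):
    """Write the header plus one line per row; return the number of data rows."""
    rows = list(rows)  # accept any iterable, e.g. a generator from the scraping loop
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        # Still inside the with block, so the file is open for these writes
        writer.writerows(rows)
    return len(rows)
```

If the returned count is 0, the selectors matched nothing and the scraping step is what needs fixing, not the CSV step.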

Here is a link I bookmarked a long time ago. It has helped me a lot over time.

https://docs.python-guide.org/scenarios/scrape/
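
Since the question mentions billboard.py, that package may be simpler than parsing the HTML by hand. Here is a rough sketch, assuming billboard.py's `ChartData('hot-100')` call and the `rank`, `artist`, `title`, `lastPos`, `peakPos`, and `weeks` attributes on each entry as described in its README (worth checking against the version you have installed). `entry_to_row` and `save_hot100` are hypothetical helper names.

```python
import csv

HEADER = ["Billboard Number", "Artist Name", "Song Title",
          "Last Week Number", "peak_position", "weeks_on_chart"]


def entry_to_row(entry):
    """Map one chart entry to a CSV row in the same order as HEADER."""
    return [entry.rank, entry.artist, entry.title,
            entry.lastPos, entry.peakPos, entry.weeks]


def save_hot100(path="Billboard_Hot100.csv"):
    """Fetch the current Hot 100 and write it to a CSV file (needs network access)."""
    # Third-party package (pip install billboard.py); imported here so that
    # entry_to_row can be used and tested without it installed.
    import billboard

    chart = billboard.ChartData("hot-100")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        for entry in chart.entries:
            writer.writerow(entry_to_row(entry))
```

Calling `save_hot100()` should then produce the file with one line per chart entry, provided the attribute names above match your installed version.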