Question

My code is as follows:

{{1}}

All of the data prints to the screen, but only part of it reaches my CSV file. I'm a beginner, so any help is much appreciated.

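I wonder whether, as in this post (Pandas prints to screen corrently but saves only some data to csv), I am trying to capture too much data at once, but the explanation there is beyond my technical level.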

Answer 1

artist_csv_file = open('artist_data.csv', 'w')

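Because of the 'w' mode, this line truncates and overwrites the file on every loop iteration, so only the data from the last iteration survives. Try 'a' (append) instead; each iteration's results are then added to the end of the file.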
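To see the difference between the two modes in isolation, here is a minimal sketch that reopens a file once per row, the way the loop in the question does. The helper names `write_rows` and `read_rows`, the sample rows, and the temporary file names are made up for illustration and are not part of the original code.

```python
import csv
import os
import tempfile


def write_rows(path, rows, mode):
    """Reopen the file for every row, mimicking an open() call inside a loop."""
    for row in rows:
        with open(path, mode, newline='') as f:
            csv.writer(f).writerow(row)


def read_rows(path):
    """Read the whole CSV file back as a list of rows."""
    with open(path, newline='') as f:
        return list(csv.reader(f))


# Hypothetical sample data standing in for the scraped rows.
rows = [['a', '1'], ['b', '2'], ['c', '3']]

tmp_dir = tempfile.mkdtemp()
w_path = os.path.join(tmp_dir, 'w_mode.csv')
a_path = os.path.join(tmp_dir, 'a_mode.csv')

write_rows(w_path, rows, 'w')  # every open() truncates the file first
write_rows(a_path, rows, 'a')  # every open() keeps the existing content

w_result = read_rows(w_path)
a_result = read_rows(a_path)
print(w_result)  # only the last row is left
print(a_result)  # all three rows are present
```

With 'w' the file ends up holding just the final row, which matches the symptom of everything printing to the screen but only part of it reaching the CSV.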

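You should probably also initialize the file before the loop to add the column headers; otherwise a new header row may be written on every iteration.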

[...]
urls = csv.reader(csvf)

# create/clear artist_data.csv and insert the column headers
with open('artist_data.csv', 'w+', newline='') as artist_csv_file:
    csv_writer = csv.writer(artist_csv_file)
    csv_writer.writerow(['date_text', 'artist', 'track', 'url'])

# This opens the file once to write the column headers; opening it first with 'w+' makes sure any previous content is cleared.
# If the file is always empty when you run the program, you could do it all in one context manager with 'a+'.


# now open the csv in append mode and do the scraping
with open('artist_data.csv', 'a', newline='') as artist_csv_file:

    csv_writer = csv.writer(artist_csv_file)

    for url in urls:

        [...]
        # these lines from the original code should be removed from the loop body:
        artist_csv_file = open('artist_data.csv', 'w')
        csv_writer = csv.writer(artist_csv_file)
        csv_writer.writerow(['date_text', 'artist', 'track', 'url'])
        [...]


# No close() call is needed at the end: the with block closes the file.
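As a self-contained variant of the same idea, the sketch below opens the output file a single time before the loop, writes the header once, and then writes rows inside the loop. Because the scraping code is elided above, `fake_scrape` is a hypothetical stand-in that returns dummy rows, and the URLs and output path are assumptions for illustration only.

```python
import csv
import os
import tempfile

HEADER = ['date_text', 'artist', 'track', 'url']


def fake_scrape(url):
    """Hypothetical stand-in for the BeautifulSoup scraping step."""
    return [['2020-01-01', 'Some Artist', 'Some Track', url]]


def scrape_to_csv(urls, out_path):
    # Open once, before the loop: 'w' clears old content a single time.
    with open(out_path, 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        writer.writerow(HEADER)  # header written exactly once
        for url in urls:
            # Only row writes happen inside the loop; no open() here.
            writer.writerows(fake_scrape(url))


out_path = os.path.join(tempfile.mkdtemp(), 'artist_data.csv')
scrape_to_csv(['http://example.com/1', 'http://example.com/2'], out_path)

with open(out_path, newline='') as f:
    result = list(csv.reader(f))
print(result)
```

Keeping a single `with` block around the whole loop avoids the separate append step, at the cost of holding the file open for the duration of the scrape.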

Hope this helps. Have fun!

Data scraped with BeautifulSoup prints to the screen but is not saved
