I am having trouble writing the content parsed by my web scraper to a CSV file.
csvfile = open('names.csv', 'a+')
fieldnames = ['news_url','news_title','news_author','date_pub','date_up','news_desc']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
data_list = {
    'news_url': news_url,
    'news_title': news_title,
    'news_author': news_author,
    'date_pub': date_pub,
    'date_up': date_up,
    'news_desc': news_desc,
}
if '' not in data_list.values():
    writer.writerow(data_list)
The file I get comes out in the following format:
https://s3-us-west-2.amazonaws.com/my-contents/names.csv
Answer 0 (score: 0)
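You have not shown what your data looks like, so I am assuming you have a list of data values.
From the code you have, data_list is just a single row. It should therefore be written with a single call to writerow(); there is no need to iterate over the data_list dictionary you created: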
import csv

fieldnames = ['news_url', 'news_title', 'news_author', 'date_pub', 'date_up', 'news_desc']

with open('names.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames)
    csv_output.writeheader()

    data = [
        ['url1', 'title1', 'author1', '1/1/2018', '2/1/2018', 'blah1'],
        ['url2', 'title2', 'author2', '1/1/2018', '2/1/2018', 'blah2'],
        ['url3', 'title3', 'author3', '1/1/2018', '2/1/2018', 'blah3'],
        ['url4', 'title4', 'author4', '1/1/2018', '2/1/2018', 'blah4']]

    for news_url, news_title, news_author, date_pub, date_up, news_desc in data:
        # Build one dictionary per row, keyed by the field names
        data_list = {
            'news_url': news_url,
            'news_title': news_title,
            'news_author': news_author,
            'date_pub': date_pub,
            'date_up': date_up,
            'news_desc': news_desc}

        csv_output.writerow(data_list)
This would create names.csv containing:
news_url,news_title,news_author,date_pub,date_up,news_desc
url1,title1,author1,1/1/2018,2/1/2018,blah1
url2,title2,author2,1/1/2018,2/1/2018,blah2
url3,title3,author3,1/1/2018,2/1/2018,blah3
url4,title4,author4,1/1/2018,2/1/2018,blah4
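If your data looks much like data above, then using a DictWriter is a bit overkill, because your data is already a list of elements in the correct order and of the correct size. If that is the case, the following approach would be easier: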
import csv

fieldnames = ['news_url', 'news_title', 'news_author', 'date_pub', 'date_up', 'news_desc']

with open('names.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(fieldnames)  # header row

    data = [
        ['url1', 'title1', 'author1', '1/1/2018', '2/1/2018', 'blah1'],
        ['url2', 'title2', 'author2', '1/1/2018', '2/1/2018', 'blah2'],
        ['url3', 'title3', 'author3', '1/1/2018', '2/1/2018', 'blah3'],
        ['url4', 'title4', 'author4', '1/1/2018', '2/1/2018', 'blah4']]

    csv_output.writerows(data)  # write all rows in one call
This produces the same output. Both examples assume you are using Python 3.x.
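The code in the question opens the file in append mode and calls writeheader() every time, which may be why the header line repeats in the output. If the scraper really does need to append one article at a time, a minimal sketch could write the header only when the file is new or empty. This assumes one call per scraped article; the names append_row and FIELDNAMES are hypothetical and do not come from the question.

```python
import csv
import os

# Assumed column order, taken from the fieldnames list in the question
FIELDNAMES = ['news_url', 'news_title', 'news_author', 'date_pub', 'date_up', 'news_desc']


def append_row(path, row):
    """Append one row to the CSV at `path` (hypothetical helper).

    Returns False and writes nothing if any value is an empty string,
    mirroring the `'' not in data_list.values()` check in the question.
    """
    if '' in row.values():
        return False

    # Only a missing or zero-length file still needs a header line
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0

    with open(path, 'a', newline='') as f_output:
        csv_output = csv.DictWriter(f_output, fieldnames=FIELDNAMES)
        if need_header:
            csv_output.writeheader()
        csv_output.writerow(row)
    return True
```

Calling append_row('names.csv', data_list) once per article should then leave a single header at the top of the file, however many times the scraper runs.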