I have just started learning the csv module. Suppose we have this CSV file:
John,Jeff,Judy,
21,19,32,
178,182,169,
85,74,57,
We want to read this file and create a dictionary with the names as keys and the total of each column as values. So in this case we would end up with:
d = {"John" : 284, "Jeff" : 275, "Judy" : 258}
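So I wrote this code, which apparently works, but I'm not happy with it and wonder whether anyone knows a better, more efficient, or more elegant way, because there are far too many lines :D (Perhaps we could also generalize it a bit, i.e. handle the case where we don't know how many fields there are.)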
d = {}
import csv

# First pass: add up each column, skipping the header row.
with open("file.csv") as f:
    readObject = csv.reader(f)
    totals0 = 0
    totals1 = 0
    totals2 = 0
    totals3 = 0
    currentRowTotal = 0
    for row in readObject:
        currentRowTotal += 1
        if currentRowTotal == 1:
            continue
        totals0 += int(row[0])
        totals1 += int(row[1])
        totals2 += int(row[2])
        if row[3] == "":
            totals3 += 0
f.close()

# Second pass: reread the header row and pair each name with its total.
with open("file.csv") as f:
    readObject = csv.reader(f)
    currentRow = 0
    for row in readObject:
        while currentRow <= 0:
            d.update({row[0] : totals0})
            d.update({row[1] : totals1})
            d.update({row[2] : totals2})
            d.update({row[3] : totals3})
            currentRow += 1
print(d)
f.close()
Thanks a lot for any answers :)
Answer 0 (score: 3)
Not sure whether you can use pandas, but if so you can get the dict like this:
import pandas as pd
df = pd.read_csv('data.csv')
print(dict(df.sum()))
This gives:
{'Jeff': 275, 'Judy': 258, 'John': 284}
Answer 1 (score: 0)
Use the top row to work out what the column titles are. Initialize the totals dictionary from those titles.
import csv

with open("file.csv") as f:
    reader = csv.reader(f)
    titles = next(reader)
    # Drop the empty titles produced by the trailing comma.
    while titles[-1] == '':
        titles.pop()
    num_titles = len(titles)
    totals = { title: 0 for title in titles }
    for row in reader:
        for i in range(num_titles):
            totals[titles[i]] += int(row[i])
print(totals)
I'll add that you don't have to close the file after the with block. The whole point of with is that it takes care of closing the file.
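To illustrate that point, here is a minimal sketch that checks the file object's closed attribute inside and after a with block. The temporary file and the variable names (closed_inside, closed_after) are only there to keep the example self-contained; they are not part of the question's code.

```python
import os
import tempfile

# Write a small throwaway CSV file so the example does not depend on file.csv.
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w") as tmp:
    tmp.write("John,Jeff,Judy\n21,19,32\n")

with open(path) as f:
    first_line = f.readline()
    closed_inside = f.closed   # still open while inside the block

closed_after = f.closed        # closed automatically when the block exits

os.remove(path)
print(closed_inside, closed_after)
```

Calling f.close() again afterwards is harmless, just redundant.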
Also, I should mention that the data you posted appears to have four columns:
John,Jeff,Judy,
21,19,32,
178,182,169,
85,74,57,
That is why I did this:
while titles[-1] == '':
    titles.pop()
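If it helps, the same idea can be wrapped in a small function so it works for any number of fields. This is only a sketch: column_totals is a hypothetical helper name, the data is fed from an in-memory string instead of file.csv, and it assumes every remaining cell is an integer.

```python
import csv
import io

def column_totals(lines):
    """Return {title: column sum} for CSV data given as an iterable of lines.

    Hypothetical helper: empty titles at the end of the header row
    (caused by a trailing comma) are dropped, and only as many cells
    as there are titles are summed in each row.
    """
    reader = csv.reader(lines)
    titles = next(reader)
    while titles and titles[-1] == "":
        titles.pop()
    totals = {title: 0 for title in titles}
    for row in reader:
        # zip stops at the shorter sequence, so the trailing empty cell is ignored.
        for title, cell in zip(titles, row):
            totals[title] += int(cell)
    return totals

# The sample data from the question, including the trailing commas.
sample = io.StringIO("John,Jeff,Judy,\n21,19,32,\n178,182,169,\n85,74,57,\n")
result = column_totals(sample)
print(result)
```

With a real file you would pass the open file object instead, e.g. column_totals(f) inside a with open("file.csv") as f: block.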
Answer 2 (score: 0)
This is a bit dirty, but try this (it does not handle the empty last column):
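#!/usr/bin/python
import csv
from functools import reduce  # reduce is not a builtin in Python 3

import numpy

with open("file.csv") as f:
    reader = csv.reader(f)
    headers = next(reader)
    # Convert each row to ints, then add the rows element-wise.
    rows = [[int(value) for value in row] for row in reader]
    sums = reduce(numpy.add, rows, [0] * len(headers))
    for name, total in zip(headers, sums):
        print("{}'s total is {}".format(name, total))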
Answer 3 (score: 0)
Building on Michasel's solution, I would try to use less code and fewer variables, without depending on NumPy:
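import csv
from functools import reduce  # reduce is not a builtin in Python 3

with open("so.csv") as f:
    reader = csv.reader(f)
    titles = next(reader)
    # Add the rows pairwise, column by column (assumes no empty trailing column).
    sum_result = reduce(lambda x, y: [int(a) + int(b) for a, b in zip(x, y)], list(reader))
    print(dict(zip(titles, sum_result)))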
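As an aside, here is a sketch of how the same totals could be computed without reduce, by transposing the rows with zip(*rows) so that each tuple is one column. It is not taken from any of the answers above. It reads the question's sample data from an in-memory string, skips columns whose header is empty (the one created by the trailing comma), and assumes all rows have the same number of fields, since zip truncates to the shortest row.

```python
import csv
import io

# The sample data from the question, including the trailing commas.
sample = io.StringIO("John,Jeff,Judy,\n21,19,32,\n178,182,169,\n85,74,57,\n")

# zip(*rows) turns rows into columns: ('John', '21', '178', '85'), ...
columns = zip(*csv.reader(sample))

# The first item of each column is its header; the rest are the values.
totals = {
    header: sum(int(cell) for cell in cells)
    for header, *cells in columns
    if header  # skip the empty column produced by the trailing comma
}
print(totals)
```

This reads the whole file into memory at once, which is fine for small files like the one in the question.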