Sorting and summarizing a CSV in Python

Time: 2016-03-15 16:10:15

Tags: python csv datetime

I have a CSV file like this:

  

Datetime,Usage1,Project1
Datetime,Usage2,Project1
Datetime,Usage3,Project2
Datetime,Usage4,Project3

The goal is to summarize the usage for each project and get a report like this:

  

Project1: Usage1 Usage2
Project2: Usage3
Project3: Usage4

I started with the following Python code, but it does not work correctly:

#/usr/bin/python

# obtain all Project values into new list project_tags:

project_tags = []
ifile = open("file.csv","r")
reader = csv.reader(ifile)
headerline = ifile.next()
for row in reader:
    project_tags.append(str(row[2]))
ifile.close()

# obtain sorted and unique list and put it into a new list project_tags2
project_tags2 = []
for p in list(set(project_tags)):
    project_tags2.append(p)


# open CSV file again and compare it with new unique list
ifile2 = open("file.csv","r")
reader2 = csv.reader(ifile2)
headerline = ifile2.next()

# Loop through both new list and a CSV file, and if they matches sum it:

sum_per_project = sum_per_project + int(row[29])
for project in project_tags2:
    for row in reader2:
        if row[2] == project:
            sum_per_project = sum_per_project + int(row[1])

Any input is appreciated!

Thanks in advance.

2 Answers:

Answer 0: (score: 1)

Try the following snippet:

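summary = {}

with open("file.csv", "r") as fp:
    for line in fp:
        row = line.rstrip().split(',')
        if len(row) < 3:
            continue  # skip blank or malformed lines

        # Group the usage values (second column) by project (third column).
        key = row[2].strip()
        if key in summary:
            summary[key] += (row[1].strip(),)
        else:
            summary[key] = (row[1].strip(),)

# Sort the keys so the report order is deterministic.
for k in sorted(summary):
    print('{0}: {1}'.format(k, ' '.join(summary[k])))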

Given the sample data in your CSV file, it prints:

 Project1: Usage1 Usage2
 Project2: Usage3
 Project3: Usage4

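Answer 1: (score: 0)

Here is an approach using defaultdict.

Edit: Thanks to @Saleem for reminding me about the with clause, and that we only need to output the contents.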

from collections import defaultdict
import csv

path = "file.csv"  # the file name used in the question

summary = defaultdict(list)
with open(path, "r") as f:
    rows = csv.reader(f)
    header = next(rows)  # skip the header row; works on Python 2.6+ and 3
    for (dte, usage, proj) in rows:
        summary[proj.strip()].append(usage.strip())

# I just realized that all you needed to do was output them:
for proj, usages in sorted(summary.items()):
    print(
        "%s: %s" % (proj, ' '.join(sorted(usages)))
    )

This prints:

Project1: Usage1 Usage2
Project2: Usage3
Project3: Usage4
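
The code in the question calls int(row[1]), so the usage column may actually hold numbers that should be added up rather than listed. If that is the case, here is a minimal sketch of per-project totals. It assumes the column order is datetime, usage, project and that usage is an integer; the function name sum_usage_per_project, the has_header flag, and the sample rows are made up for illustration and are not from the original post.

```python
import csv
import io
from collections import defaultdict


def sum_usage_per_project(lines, has_header=True):
    """Return {project: total usage} for rows of (datetime, usage, project).

    `lines` is any iterable of CSV text lines, e.g. an open file object.
    """
    rows = csv.reader(lines)
    if has_header:
        next(rows, None)  # skip the header row, if there is one
    totals = defaultdict(int)
    for row in rows:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        _when, usage, project = row[:3]
        totals[project.strip()] += int(usage)  # assumes integer usage values
    return dict(totals)


# Hypothetical sample data, for illustration only.
sample = io.StringIO(
    "Datetime,Usage,Project\n"
    "2016-03-15 10:00:00,5,Project1\n"
    "2016-03-15 11:00:00,7,Project1\n"
    "2016-03-15 12:00:00,3,Project2\n"
    "2016-03-15 13:00:00,4,Project3\n"
)
totals = sum_usage_per_project(sample)
for project in sorted(totals):
    print("%s: %d" % (project, totals[project]))
```

For a real file, pass the open file object instead of the StringIO, e.g. inside `with open("file.csv", "r") as f:`. A single pass with a dictionary avoids reading the file once per project, which is what the nested loops in the question attempt.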