I have a CSV file with 3 columns:
TMC, EPOCH, Time
11C12, 1, 24
11C12, 1, 34
11C12, 2, 56
11C12, 2, 78
11C13, 1, 56
11C13, 2, 45
11C13, 2, 64
11C13, 3, 32
11C13, 3, 28
Now I want a file average.py that computes the average Time for each combination of TMC and EPOCH and writes it to a txt or csv file.
The desired output is:
TMC, EPOCH, Average Time
11C12, 1, average value
11C12, 2, average value
11C13, 1, average value
11C13, 2, average value
11C13, 3, average value
Answer 0 (score: 0)
Use a defaultdict to group the elements, with the first two columns as the key, then compute the average time for each group and write it to a new csv:
import csv
from collections import defaultdict

with open("in.csv") as f, open("average.csv", "w") as out:
    d = defaultdict(list)
    head = next(f)  # read the header line and copy it through unchanged
    out.write(head)
    for row in csv.reader(f):
        # group the Time values under their (TMC, EPOCH) key
        d[tuple(row[:2])].append(int(row[-1]))
    for k, v in d.items():
        # sum(v, 0.0) forces float division, so this also works on Python 2
        out.write("{},{},{}\n".format(k[0], k[1], sum(v, 0.0) / len(v)))
Output:
TMC,EPOCH,Time
11C12,1,29.0
11C12,2,67.0
11C13,1,56.0
11C13,2,54.5
11C13,3,30.0
If you want to keep the order in which the elements were first seen, you can use an OrderedDict:
import csv
from collections import OrderedDict

with open("in.csv") as f, open("average.csv", "w") as out:
    d = OrderedDict()  # remembers the order in which keys were first inserted
    head = next(f)  # read the header line and copy it through unchanged
    out.write(head)
    for row in csv.reader(f):
        # setdefault creates the list the first time a (TMC, EPOCH) key is seen
        d.setdefault(tuple(row[:2]), []).append(int(row[-1]))
    for k, v in d.items():
        out.write("{},{},{}\n".format(k[0], k[1], sum(v, 0.0) / len(v)))
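On Python 3.7 and later a plain dict also preserves insertion order, so the same grouping can be written without OrderedDict. The following is only a sketch under a few assumptions that are not in the answer above: the function name write_averages is made up for illustration, the header is rewritten to "Average Time" as the question asks, skipinitialspace=True is used because the sample rows in the question have a space after each comma, and the times are parsed as floats.

```python
import csv


def write_averages(in_path, out_path):
    """Average the Time column for each (TMC, EPOCH) pair, in first-seen order.

    Hypothetical helper for illustration; not part of the original answer.
    """
    groups = {}  # plain dict keeps insertion order on Python 3.7+
    with open(in_path, newline="") as f:
        # Assumption: fields may be written as "11C12, 1, 24", so drop the
        # blank that follows each comma.
        reader = csv.reader(f, skipinitialspace=True)
        next(reader)  # skip the input header row
        for row in reader:
            if not row:  # ignore blank lines
                continue
            groups.setdefault((row[0], row[1]), []).append(float(row[2]))

    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["TMC", "EPOCH", "Average Time"])
        for (tmc, epoch), times in groups.items():
            writer.writerow([tmc, epoch, sum(times) / len(times)])
```

It would be called as write_averages("in.csv", "average.csv"). Using csv.writer for the output instead of string formatting means any field that happens to contain a comma is quoted correctly.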