Using Python, I need to average values based on two keys in two columns of a CSV file

Time: 2016-06-05 22:44:07

Tags: python csv

I have a CSV file with 3 columns:

TMC, EPOCH, Time
11C12, 1, 24
11C12, 1, 34
11C12, 2, 56
11C12, 2, 78
11C13, 1, 56
11C13, 2, 45
11C13, 2, 64
11C13, 3, 32
11C13, 3, 28

Now I want a script, average.py, that computes the average Time for each combination of TMC and EPOCH and writes the result to a txt or csv file.

The desired output is:

TMC, EPOCH, Average Time
11C12, 1, average value 
11C12, 2, average value
11C13, 1, average value
11C13, 2, average value
11C13, 3, average value

1 answer:

Answer 0 (score: 0)

Use a defaultdict to group the rows with the first two columns as the key, then compute the average time for each group and write it to a new csv:

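import csv
from collections import defaultdict

# newline="" lets the csv module control line endings itself
with open("in.csv", newline="") as f, open("average.csv", "w", newline="") as out:
    # skipinitialspace drops the blank after each comma in "11C12, 1, 24"
    reader = csv.reader(f, skipinitialspace=True)
    wr = csv.writer(out)
    d = defaultdict(list)
    wr.writerow(next(reader))  # copy the header row through unchanged
    for row in reader:
        # group the Time values by their (TMC, EPOCH) pair
        d[tuple(row[:2])].append(int(row[-1]))

    for k, v in d.items():
        wr.writerow([k[0], k[1], sum(v, 0.0) / len(v)])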

Output:

TMC,EPOCH,Time
11C12,1,29.0
11C12,2,67.0
11C13,1,56.0
11C13,2,54.5
11C13,3,30.0
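
To check the grouping step without touching the filesystem, here is a minimal sketch that runs the same idea on the question's sample data held in memory. The names `SAMPLE` and `group_averages` are illustrative and do not come from the original answer; the sketch also assumes Python 3, where `/` is true division.

```python
import csv
import io
from collections import defaultdict

# Sample data copied from the question, including the space after each comma.
SAMPLE = """TMC, EPOCH, Time
11C12, 1, 24
11C12, 1, 34
11C12, 2, 56
11C12, 2, 78
11C13, 1, 56
11C13, 2, 45
11C13, 2, 64
11C13, 3, 32
11C13, 3, 28
"""


def group_averages(lines):
    """Return {(TMC, EPOCH): average Time} for an iterable of CSV lines."""
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader)  # skip the header row
    groups = defaultdict(list)
    for row in reader:
        if row:  # ignore blank lines
            groups[(row[0], row[1])].append(int(row[2]))
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


averages = group_averages(io.StringIO(SAMPLE))
print(averages[("11C13", "2")])  # 54.5
```

Note that the keys are tuples of strings, so the EPOCH part is "2" rather than the integer 2.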

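If you want to keep the order in which the keys were first seen, you can use an OrderedDict (on Python 3.7+ a plain dict, and therefore defaultdict, already preserves insertion order):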

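import csv
from collections import OrderedDict

# newline="" lets the csv module control line endings itself
with open("in.csv", newline="") as f, open("average.csv", "w", newline="") as out:
    # skipinitialspace drops the blank after each comma in "11C12, 1, 24"
    reader = csv.reader(f, skipinitialspace=True)
    wr = csv.writer(out)
    d = OrderedDict()
    wr.writerow(next(reader))  # copy the header row through unchanged
    for row in reader:
        # setdefault creates the list the first time a (TMC, EPOCH) pair is seen
        d.setdefault(tuple(row[:2]), []).append(int(row[-1]))

    for k, v in d.items():
        wr.writerow([k[0], k[1], sum(v, 0.0) / len(v)])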
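
The question asked for an "Average Time" header, whereas the code above copies the input header through unchanged. As a hedged variation, the sketch below writes the requested header and keeps only a running sum and count per key instead of a list of values, which may help with large files. The function name `write_averages` and its `header` parameter are assumptions made for illustration; it relies on Python 3.7+ dict ordering to keep first-seen order.

```python
import csv
import io


def write_averages(src, dst, header=("TMC", "EPOCH", "Average Time")):
    """Read rows from src and write one averaged row per (TMC, EPOCH) to dst."""
    reader = csv.reader(src, skipinitialspace=True)
    writer = csv.writer(dst, lineterminator="\n")
    next(reader, None)  # drop the input header, if any
    totals = {}  # (TMC, EPOCH) -> [running sum, count], in first-seen order
    for row in reader:
        if not row:  # ignore blank lines
            continue
        entry = totals.setdefault((row[0], row[1]), [0, 0])
        entry[0] += int(row[2])
        entry[1] += 1
    writer.writerow(header)
    for (tmc, epoch), (total, count) in totals.items():
        writer.writerow([tmc, epoch, total / count])


# In-memory demonstration; the rows are deliberately out of sorted order.
src = io.StringIO("TMC, EPOCH, Time\n11C13, 3, 32\n11C13, 3, 28\n11C12, 1, 24\n")
dst = io.StringIO()
write_averages(src, dst)
result = dst.getvalue()
print(result)
```

With real files, the same function could be called as write_averages(f, out) inside a with block that opens both files with newline="".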