How to get per-key averages from two columns of a CSV file

Time: 2014-09-10 11:48:45

Tags: python

I have a CSV file with 2 columns:

rw1, 24
rw2, 34
rw3, 56
rw1, 78
rw2, 56
rw2, 45
rw2, 64
rw3, 32
rw1, 28

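Now I want an average.py script that calculates the average of all rw1, rw2, and rw3 values separately and writes the results to an average.txt file, like this: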

rw1 - average value,
rw2 - average value, 
rw3 - average value

3 answers:

Answer 0: (score: 2)

With pandas, this is fairly short:

import pandas as pd
df = pd.read_csv(file, header=None)

In [1]: df
Out[1]: 
     0   1
0  rw1  24
1  rw2  34
2  rw3  56
3  rw1  78
4  rw2  56
5  rw2  45
6  rw2  64
7  rw3  32
8  rw1  28

In [2]: df.groupby(df[0]).mean() # group the rows by column 0 and compute the mean of each group
Out[2]: 
             1
0             
rw1  43.333333
rw2  49.750000
rw3  44.000000
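
The session above stops at the grouped means; the question also asks for the result to be written to average.txt. A minimal sketch of that last step, assuming the two columns are given the hypothetical names "name" and "value" and that the `name - average` layout from the question is wanted (the helper `write_averages` is illustrative, not part of pandas):

```python
import pandas as pd

def write_averages(csv_source, out_path="average.txt"):
    # Hypothetical helper: csv_source is a path or file-like object.
    # header=None: the file has no header row.
    # skipinitialspace=True: drop the blank that follows each comma.
    df = pd.read_csv(csv_source, header=None, names=["name", "value"],
                     skipinitialspace=True)
    # groupby sorts the group keys by default, so rw1, rw2, rw3 come out in order.
    means = df.groupby("name")["value"].mean()
    lines = ["%s - %g" % (name, mean) for name, mean in means.items()]
    with open(out_path, "w") as f:
        f.write(",\n".join(lines))
    return lines
```

Calling `write_averages('yourfile.csv')` (the path is a placeholder) should leave one `name - average` line per key in average.txt, separated by commas.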

Hope this helps!

Answer 1: (score: 0)

Read the CSV rows and convert them to tuples, then sort them so that itertools.groupby can group them by key:

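import itertools
import csv

fileLocation = 'newslot.csv'
# Python 3: open in text mode with newline='', as the csv module recommends
with open(fileLocation, newline='') as f:
    r = csv.reader(f)
    # groupby only merges adjacent rows, so sort by key first
    lis = sorted((i[0], i[1]) for i in r)
    for k, g in itertools.groupby(lis, key=lambda x: x[0]):
        g = list(g)
        # true division keeps the fractional part of the average
        print(k, sum(int(i[1]) for i in g) / len(g))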
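
The sort matters because itertools.groupby only merges adjacent items that share a key. A small sketch of the difference, using a hypothetical helper `group_means` and three made-up rows in the same shape as the question's data:

```python
from itertools import groupby
from operator import itemgetter

def group_means(pairs):
    """Average the second item over each run of pairs sharing the same first item."""
    result = []
    for key, group in groupby(pairs, key=itemgetter(0)):
        values = [value for _, value in group]
        result.append((key, sum(values) / len(values)))
    return result

rows = [("rw1", 24), ("rw2", 34), ("rw1", 78)]

# Unsorted input: rw1 shows up as two separate runs, so it is reported twice.
unsorted_means = group_means(rows)

# Sorted input: equal keys are adjacent, so each key is reported once.
sorted_means = group_means(sorted(rows))
```

Here `unsorted_means` contains two entries for rw1, while `sorted_means` contains a single rw1 entry averaging 24 and 78.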

Answer 2: (score: 0)

from itertools import groupby
from operator import itemgetter
import csv

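def avg(values):
    # Build a list first: in Python 3, map() returns an iterator with no len()
    values = [float(x) for x in values]
    return sum(values) / len(values)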

def avgcsv(filename, k=0, v=1):
    with open(filename) as f:
        data = sorted(csv.reader(f, skipinitialspace=True), key=itemgetter(k))
    return ['%s - %g' % (name, avg(map(itemgetter(v), group)))
            for name, group in groupby(data, key=itemgetter(k))]

with open('average.txt', 'w') as f:
    f.write(',\n'.join(avgcsv('filename', 0, 1)))

Output:

rw1 - 43.3333,
rw2 - 49.75,
rw3 - 44