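I am trying to write a simple Python function that reads a CSV file and finds the averages of some columns and rows. The function should examine the first row and, for each column whose header starts with the letter 'Q', calculate the average of that column and print it to the screen. Then, for each row of data, it should calculate the student's average over all columns whose header starts with 'Q'. It should calculate this average both normally and with the lowest quiz dropped, so it prints two values per student.
The CSV file contains the students' grades and looks like this: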
hw1 hw2 Quiz3 hw4 Quiz2 Quiz1
john 87 98 76 67 90 56
marie 45 67 65 98 78 67
paul 54 64 93 28 83 98
fred 67 87 45 98 56 87
This is my code so far, but I don't know how to continue:
import csv
def practice():
    newlist=[]
    afile= input('enter file name')
    a = open(afile, 'r')
    reader = csv.reader(a, delimiter = ",")
    for each in reader:
        newlist.append(each)
    y=sum(int(x[2] for x in reader))
    print (y)
    filtered = []
    total = 0
    for i in range (0,len(newlist)):
        if 'Q' in [i][1]:
            filtered.append(newlist[i])
    return filtered
Answer 0 (score: 1)
I'd suggest using pandas:
>>> import pandas as pd
>>> data = pd.read_csv('file.csv', sep=r'\s+')
>>> q_columns = [name for name in data.columns if name.startswith('Q')]
>>> reduced_data = data[q_columns].astype(float)  # float copy, so NaN can be assigned later
>>> reduced_data.mean()
Quiz3 69.75
Quiz2 76.75
Quiz1 77.00
dtype: float64
>>> reduced_data.mean(axis=1)
john 74.000000
marie 70.000000
paul 91.333333
fred 62.666667
dtype: float64
>>> import numpy as np
>>> for index, column in reduced_data.idxmin(axis=1).items():
...     reduced_data.loc[index, column] = np.nan
...
>>> reduced_data.mean(axis=1)
john 83.0
marie 72.5
paul 95.5
fred 71.5
dtype: float64
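The drop-the-lowest step can also be done without overwriting any cells: subtract each row's minimum from its sum and divide by one fewer column. This is only a minimal sketch. It assumes the question's data is inlined through io.StringIO and that every student has at least two quiz scores; the names raw, quizzes and dropped_mean are illustrative.

```python
import io

import pandas as pd

# Sample data copied from the question. The header has one fewer field than
# the data rows, so pandas uses the first column (the names) as the index.
raw = """hw1 hw2 Quiz3 hw4 Quiz2 Quiz1
john 87 98 76 67 90 56
marie 45 67 65 98 78 67
paul 54 64 93 28 83 98
fred 67 87 45 98 56 87
"""

data = pd.read_csv(io.StringIO(raw), sep=r"\s+")
quizzes = data[[name for name in data.columns if name.startswith("Q")]]

# Row total minus the lowest quiz, divided by the number of remaining quizzes.
dropped_mean = (quizzes.sum(axis=1) - quizzes.min(axis=1)) / (quizzes.shape[1] - 1)
print(dropped_mean)
```

If a student has two equal lowest scores, only one of them is dropped, which matches the idxmin loop above.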
Answer 1 (score: 0)
The code gets simpler if you change the format of the .csv file to the following. Then we can easily use DictReader.
name,hw1,hw2,Quiz3,hw4,Quiz2,Quiz1
john,87,98,76,67,90,56
marie,45,67,65,98,78,67
paul,54,64,93,28,83,98
fred,67,87,45,98,56,87
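The comma-separated layout above can be produced from the original whitespace-separated file. A minimal sketch, assuming the original header row has no label for the name column (as in the question); the function name to_comma_csv and the label "name" are illustrative choices, not part of either answer:

```python
import csv
import io


def to_comma_csv(text, first_header="name"):
    """Convert whitespace-separated grade text to comma-separated CSV text."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # Assumption: the header lacks a label for the first (name) column.
    rows[0] = [first_header] + rows[0]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


original = """hw1 hw2 Quiz3 hw4 Quiz2 Quiz1
john 87 98 76 67 90 56
marie 45 67 65 98 78 67
paul 54 64 93 28 83 98
fred 67 87 45 98 56 87
"""

converted = to_comma_csv(original)
print(converted)
```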
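import numpy as np
from collections import defaultdict
import csv

# Maps each student's name to the list of their quiz scores.
result = defaultdict(list)
with open('grades.csv', 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        for k in row:
            # Only columns whose header starts with 'Q' are quizzes.
            if k.startswith('Q'):
                result[row['name']].append(int(row[k]))

for name, lst in result.items():
    # Sort ascending and skip the first (lowest) score before averaging.
    print(name, np.mean(sorted(lst)[1:]))

Output:
john 83.0
marie 72.5
paul 95.5
fred 71.5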
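For completeness, the three numbers the question asks for (per-column quiz averages, plus each student's average with and without the lowest quiz) can be computed with the standard library alone. This is a sketch rather than a definitive implementation. It assumes the comma-separated layout with a name column and at least two quiz columns; quiz_report and average are hypothetical helper names.

```python
import csv
import io


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def quiz_report(csvfile):
    """Return (column_means, students) for columns whose header starts with 'Q'.

    students maps each name to (plain average, average with lowest quiz dropped).
    """
    reader = csv.DictReader(csvfile)
    rows = list(reader)
    quiz_cols = [c for c in reader.fieldnames if c.startswith("Q")]
    column_means = {c: average([int(r[c]) for r in rows]) for c in quiz_cols}
    students = {}
    for r in rows:
        scores = [int(r[c]) for c in quiz_cols]
        # sorted(scores)[1:] skips the single lowest score.
        students[r["name"]] = (average(scores), average(sorted(scores)[1:]))
    return column_means, students


sample = """name,hw1,hw2,Quiz3,hw4,Quiz2,Quiz1
john,87,98,76,67,90,56
marie,45,67,65,98,78,67
paul,54,64,93,28,83,98
fred,67,87,45,98,56,87
"""

column_means, students = quiz_report(io.StringIO(sample))
for col, value in column_means.items():
    print(col, value)
for name, (plain, dropped) in students.items():
    print(name, plain, dropped)
```

To read from disk instead of the inlined sample, pass an open file object, for example one obtained with open('grades.csv', newline='').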