I have a CSV file like this:
student | score
John | A
John | C
Mary | B
Mary | D
Kim | B
Kim | A
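Each student has multiple scores, and I want to merge the rows so that each unique student appears once, with their highest score.
The result I want is this table: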
student | score
John | A
Mary | B
Kim | A
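I tried to find posts about this but had no luck. Is there a way to do this using only built-in libraries?
Answer 0: (score: 2)
Use itertools.groupby to group the rows by student name. Because the grades are letters and 'A' is the best, the highest score in each group is the lexicographic minimum, which is why the code calls min.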
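import csv
import itertools
import operator

# Assumes each student's rows are adjacent (groupby only merges consecutive
# rows) and that fields are not padded with spaces around '|'.
with open('1.csv', newline='') as f, open('2.csv', 'w', newline='') as fout:
    reader = csv.DictReader(f, delimiter='|')
    writer = csv.DictWriter(fout, fieldnames=reader.fieldnames, delimiter='|')
    writer.writeheader()
    for student, group in itertools.groupby(reader, key=operator.itemgetter('student')):
        # 'A' beats 'B', so the best grade is the smallest letter.
        best_score = min(map(operator.itemgetter('score'), group))
        writer.writerow({'student': student, 'score': best_score})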
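One caveat with the groupby approach: itertools.groupby only merges consecutive rows that share a key, so it depends on each student's rows being adjacent, as they are in the sample. If the rows may be interleaved, sorting by student first is one way around that. The following is a minimal sketch, assuming the data fits in memory; the helper name best_by_group and the interleaved sample rows are made up for illustration.

```python
import itertools
import operator


def best_by_group(rows):
    """Return {student: best grade} for (student, grade) pairs in any order."""
    get_student = operator.itemgetter(0)
    # Sort by student so that groupby sees each student's rows together.
    ordered = sorted(rows, key=get_student)
    return {
        student: min(grade for _, grade in group)  # 'A' is the best grade
        for student, group in itertools.groupby(ordered, key=get_student)
    }


# Hypothetical, deliberately interleaved sample rows.
rows = [("John", "A"), ("Mary", "B"), ("John", "C"),
        ("Kim", "B"), ("Mary", "D"), ("Kim", "A")]
print(best_by_group(rows))
```

Sorting costs O(n log n) and changes the output order to alphabetical by student; if the original order matters, the dictionary approach avoids the sort.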
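Answer 1: (score: 1)
Use a dictionary and store only the best value found so far. Because the scores are given as letters, the highest score is the lexicographically lowest letter: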
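import csv

# inputcsvfile and outputcsvfile are placeholders for your own file paths.
students = {}
with open(inputcsvfile, newline='') as scoressource:
    reader = csv.reader(scoressource, delimiter='|')
    header = next(reader)  # keep the header row out of the comparison
    for name, score in reader:
        # 'Z' is a sentinel that sorts after every letter grade.
        if score < students.get(name, 'Z'):
            students[name] = score

with open(outputcsvfile, 'w', newline='') as scoresdest:
    writer = csv.writer(scoresdest, delimiter='|')
    writer.writerow(header)
    for name, score in students.items():
        writer.writerow([name, score])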
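To try the dictionary approach without touching any files, here is a minimal sketch that runs it on the sample from the question held in a string. It assumes the file really contains spaces around the '|' separator, so it strips each field; the names SAMPLE and best_scores are illustrative and not part of the answer above.

```python
import csv
import io

# Sample data copied from the question, including the padding around '|'.
SAMPLE = """student | score
John | A
John | C
Mary | B
Mary | D
Kim | B
Kim | A
"""


def best_scores(text):
    """Return {student: best letter grade}, in first-seen order."""
    reader = csv.reader(io.StringIO(text), delimiter='|')
    next(reader)  # skip the header row
    best = {}
    for name, score in reader:
        name, score = name.strip(), score.strip()
        # Keep the lexicographically smallest letter, i.e. the best grade.
        if name not in best or score < best[name]:
            best[name] = score
    return best


for name, score in best_scores(SAMPLE).items():
    print(f"{name} | {score}")
```

Unlike groupby, this does not need the rows to be sorted or adjacent, and because Python dictionaries preserve insertion order (3.7+), the students come out in the order they first appear.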