Python: How do I group rows by one column and select one row per group based on another column?

Time: 2013-07-01 07:43:01

Tags: python csv merge

I have a CSV file like this:

student | score
John    |  A
John    |  C
Mary    |  B
Mary    |  D
Kim     |  B
Kim     |  A

Each student has multiple scores. I want to merge the rows so that each student appears only once, with their highest score.

I would like the result to be a table like this:

student | score
John    | A
Mary    | B
Kim     | A

I tried to find an existing post about this but could not. Is there a way to do this using only the built-in libraries?

2 answers:

Answer 0: (score: 2)

Use itertools.groupby to group the rows by student name.

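import csv
import itertools
import operator

# Python 3. Assumes 1.csv is '|'-delimited with no padding around the fields,
# and that each student's rows are adjacent (groupby only merges consecutive rows).
with open('1.csv', newline='') as f, open('2.csv', 'w', newline='') as fout:
    reader = csv.DictReader(f, delimiter='|')
    writer = csv.DictWriter(fout, fieldnames=reader.fieldnames, delimiter='|')
    writer.writeheader()
    for student, group in itertools.groupby(reader, key=operator.itemgetter('student')):
        # 'A' is the best grade, so the highest score is the lexicographic minimum.
        best_score = min(map(operator.itemgetter('score'), group))
        writer.writerow({'student': student, 'score': best_score})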
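
Note that itertools.groupby only merges consecutive rows that share a key, so it relies on the file already being grouped by student, as in the question's sample. If the rows might arrive in any order, sorting first is one option. A minimal sketch, using in-memory tuples instead of a file and a made-up helper name (best_per_student) purely for illustration:

```python
import itertools
import operator


def best_per_student(rows):
    """Return {student: best score} for (student, score) pairs in any order.

    Hypothetical helper for illustration; assumes letter grades where
    'A' is the best score, so the best grade is the lexicographic minimum.
    """
    get_student = operator.itemgetter(0)
    # groupby only groups adjacent items, so sort by student first.
    ordered = sorted(rows, key=get_student)
    return {
        student: min(score for _, score in group)
        for student, group in itertools.groupby(ordered, key=get_student)
    }


rows = [("John", "A"), ("Mary", "B"), ("John", "C"),
        ("Kim", "B"), ("Mary", "D"), ("Kim", "A")]
result = best_per_student(rows)
print(result)
```

Sorting costs O(n log n) and loads everything into memory, which is fine for a small grade sheet but worth keeping in mind for large files.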

Answer 1: (score: 1)

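Use a dictionary and store only the highest score found so far for each student. Because the scores are given as letters, this means you need to find the lexicographically "lowest" letter: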

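import csv

# Python 3. inputcsvfile and outputcsvfile are paths you supply; the input is
# assumed to be comma-separated "name,score" rows with no header row.
students = {}

with open(inputcsvfile, newline='') as scoressource:
    reader = csv.reader(scoressource)
    for name, score in reader:
        # 'Z' is a sentinel that sorts after every real grade.
        if score < students.get(name, 'Z'):
            students[name] = score

with open(outputcsvfile, 'w', newline='') as scoresdest:
    writer = csv.writer(scoresdest)
    for name, score in students.items():
        writer.writerow([name, score])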
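
The sample in the question is pipe-delimited and padded with spaces, whereas the code above reads a plain comma-separated file. Here is a rough sketch of the same dictionary idea adapted to that layout. It assumes a header row is present and uses an in-memory string in place of a real file; the function name highest_scores is made up for illustration.

```python
import csv
import io

SAMPLE = """\
student | score
John    |  A
John    |  C
Mary    |  B
Mary    |  D
Kim     |  B
Kim     |  A
"""


def highest_scores(lines):
    """Map each student to their best letter grade ('A' beats 'B', and so on)."""
    reader = csv.reader(lines, delimiter="|")
    next(reader)  # skip the header row (assumed to be present)
    best = {}
    for row in reader:
        if len(row) != 2:
            continue  # ignore blank or malformed lines
        name, score = (field.strip() for field in row)
        # Keep the lexicographically smallest grade seen so far.
        if name not in best or score < best[name]:
            best[name] = score
    return best


best = highest_scores(io.StringIO(SAMPLE))
for name, score in best.items():
    print(f"{name} | {score}")
```

Unlike the groupby approach, this does not require the rows to be grouped by student, and the dictionary keeps students in the order they first appear.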