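Check whether a pair of keys already has a value in a dictionary?

Time: 2016-11-09 14:28:30

Tags: python dictionary

I have a file containing the names, scores, and exam numbers (in that order) of many students, and I want to know which exam each student did best in (scores range from 1 to 5, with 1 being the best). Some students may have taken only one exam, others took both. The file looks like this: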

student1,4.2,1
student2,1.02,1
student3,4.1,1
student4,2.089,1
student2,3.02,2
student3,2.54,2
student4,3.69,2
student5,1.34,2

I planned to create a dictionary holding the name, exam number, and score, and then retrieve the best score. My code looks like this:

import re

with open('filename.csv') as f:
    lines = f.readlines()

scores = {}  # { name : { exam_number : score } }

for line in lines:
    n = re.match(r"(.*),(.*),(.*)", line)
    name = n.group(1)
    score = n.group(2)
    exam_number = n.group(3)
    scores[name] = { exam_number : score } #HERE IS THE PROBABLE ERROR

#Obtain the best score per student and the number of the exam
best_sco = {}  # { name : { exam_number : score } }
for name in scores:
    for num in scores[name]:
        score = float(scores[name][num])
        if name in best_sco:
            for num_ext in list(best_sco[name]):
                # 1 is the best score, so the lower value wins
                if score < best_sco[name][num_ext]:
                    best_sco[name] = {num: score}
        else:
            best_sco[name] = {num: score}

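I realized that whenever I try to add a new exam_number : score pair for a name that already exists, the pair previously stored for that name is deleted. For example, if I look up the scores for student4, only the score for exam 2 appears, because it was the last one read and it overwrote the previous one. Is there a way to declare a dictionary with paired keys and then iterate over all possible pairs, given that some keys (but not pairs) may be repeated?

Edit ---------------------------

The same question put slightly differently (it may ring a bell for people familiar with both Python and Perl): is there a Python equivalent of Perl's multidimensional hashes?

4 answers:

Answer 0 (score: 0)

I think it is better to store the csv file as a list of lists. Then use itertools.groupby to group the rows by name and keep the row with the highest score in each group. Here is the source code.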

import csv
import itertools
import operator

# Read the csv file as a list of lists: [name, score, exam number]
with open('test.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    lists = [[row[0], float(row[1]), int(row[2])] for row in reader]


# Obtain the best score per student and the number of the exam.
# groupby only groups adjacent rows, so the list is sorted by name first.
by_name = operator.itemgetter(0)
for k, g in itertools.groupby(sorted(lists, key=by_name), key=by_name):
    # max picks the highest score; use min instead if 1 is the best score
    best_score = max(g, key=lambda x: x[1])
    print(best_score)

# Output
# ['student1', 4.2, 1]
# ['student2', 3.02, 2]
# ['student3', 4.1, 1]
# ['student4', 3.69, 2]
# ['student5', 1.34, 2]

Previous answer:

Use nested dictionaries by implementing Perl's autovivification feature:

class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

Replace the line scores[name] = { exam_number : score } with

d = AutoVivification()
d[name][exam_number] = score
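
For only two levels of nesting, the standard library's collections.defaultdict(dict) gives a similar effect without a custom class. The following is a minimal sketch of that alternative, not the class above. It assumes the sample rows from the question, inlined here as a hypothetical `rows` list with the scores already converted to floats, and it assumes the lower score is the better one, as the question states.

```python
from collections import defaultdict

# Sample rows from the question, inlined for illustration
# ("rows" is a hypothetical name, not part of the original code).
rows = [
    ("student1", 4.2, 1),
    ("student2", 1.02, 1),
    ("student3", 4.1, 1),
    ("student4", 2.089, 1),
    ("student2", 3.02, 2),
    ("student3", 2.54, 2),
    ("student4", 3.69, 2),
    ("student5", 1.34, 2),
]

# A missing name gets an empty inner dict automatically,
# so earlier exams for the same student are not overwritten.
scores = defaultdict(dict)
for name, score, exam_number in rows:
    scores[name][exam_number] = score

# For each student, pick the exam with the lowest score
# (assumption: 1 is the best score, as in the question).
best = {}
for name, exams in scores.items():
    exam_number = min(exams, key=exams.get)
    best[name] = (exam_number, exams[exam_number])

print(best["student4"])  # (1, 2.089)
```

Unlike the class above, defaultdict(dict) only creates the first level on demand; deeper nesting would need a recursive factory.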

Answer 1 (score: 0)

You can use a tuple as the key in a dict, as long as there are no duplicate student/exam combinations

best_sco[('student-name', 'exam')] = 'score'
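
A minimal sketch of how the tuple-key idea could be applied to the data from the question. The `rows` list is a hypothetical inline copy of the sample file, and the sketch assumes the lower score is the better one, as the question states.

```python
# Sample rows from the question, inlined for illustration
# ("rows" is a hypothetical name, not part of the original code).
rows = [
    ("student1", 4.2, 1),
    ("student2", 1.02, 1),
    ("student3", 4.1, 1),
    ("student4", 2.089, 1),
    ("student2", 3.02, 2),
    ("student3", 2.54, 2),
    ("student4", 3.69, 2),
    ("student5", 1.34, 2),
]

# One flat dict keyed by (name, exam_number). Each pair is unique,
# so a second exam for the same student does not overwrite the first.
scores = {}
for name, score, exam_number in rows:
    scores[(name, exam_number)] = score

# Checking whether a given pair already has a value is a plain "in" test.
has_exam_2 = ("student4", 2) in scores
missing_exam_2 = ("student1", 2) in scores
print(has_exam_2, missing_exam_2)  # True False

# Best score per student (assumption: 1 is best, so lower wins).
best = {}
for (name, exam_number), score in scores.items():
    if name not in best or score < best[name][1]:
        best[name] = (exam_number, score)

print(best["student3"])  # (2, 2.54)
```

The flat layout makes the pair lookup trivial, at the cost of a full pass over the dict whenever all exams of one student are needed.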

Answer 2 (score: 0)

# inside the loop that reads the file
if name not in scores:
    scores[name] = {exam_number: score}
else:
    scores[name][exam_number] = score


best_exam = {}
for name, person_results in scores.items():
    best_exam_number = None
    best_score = None
    for exam_number, score in person_results.items():
        # use < instead of > if a lower score is better, as in the question
        if best_score is None or score > best_score:
            best_exam_number = exam_number
            best_score = score
    best_exam[name] = {best_exam_number: best_score}

Answer 3 (score: 0)

Use a defaultdict? Put the scores in a list and then retrieve the highest one? You can also use csv to read the file itself, which saves you from needing a regular expression

from collections import defaultdict
from operator import itemgetter
import csv


scores = defaultdict(list)  # { name : [ {"exam": exam, "score": score}, ... ] }

with open('filename.csv') as f:
    reader = csv.reader(f, delimiter=",")
    for line in reader:
        student, score, exam = line
        # convert the score so it sorts numerically rather than as text
        scores[student].append({"exam": exam, "score": float(score)}) # assuming they can only take an exam once


for student, exams in scores.items():
    # reverse=True puts the highest score first; drop it if 1 is the best score
    best = sorted(exams, key=itemgetter("score"), reverse=True)[0]
    print(student, best)