Question

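I have a sequence of (name, score) pairs in which names repeat. I want to get the highest score for each name. The name labels themselves are optional in the final result. Here is a working implementation: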

from collections import defaultdict

scores = (('eyal', 76), ('alex', 50), ('oded', 90), ('eyal', 100), ('alex', 99))
# Collect every score seen for each name.
distinct = defaultdict(set)
for name, score in scores:
    distinct[name].add(score)
# Keep only the maximum of each name's scores.
max_scores = [max(distinct[k]) for k in distinct]
print(max_scores)

I'm wondering: can this be done in a single step with a dictionary comprehension?

Answer 1

In [22]: dict(sorted(scores))
Out[22]: {'alex': 99, 'eyal': 100, 'oded': 90}

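This relies on the observation that once the tuples are sorted, we only want to keep the last tuple for each name, and dict() does exactly that: when a key appears more than once, the later value overwrites the earlier one.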
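To make that observation concrete, here is a minimal sketch that splits the one-liner into its two steps. The intermediate names `ordered` and `best` are introduced here for illustration only and are not part of the original answer.

```python
scores = (('eyal', 76), ('alex', 50), ('oded', 90), ('eyal', 100), ('alex', 99))

# Tuples compare element by element, so sorting orders by name first and
# then by score. Pairs with the same name end up adjacent, lowest score first.
ordered = sorted(scores)

# dict() processes the pairs in order; a repeated key is overwritten, so
# only the last (highest) score for each name survives.
best = dict(ordered)

print(ordered)
print(best)
```

Note that this trick works only because the maximum happens to be the last item of each sorted run; it does not extend to other aggregates.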

Alternatively,

In [16]: [max(vals) for _,vals in itertools.groupby(sorted(scores), lambda x:x[0])]
Out[16]: [('alex', 99), ('eyal', 100), ('oded', 90)]

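This is more verbose, but also more general. For example, it can easily be adapted to compute average scores, whereas the first solution cannot.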
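As a rough sketch of that adaptation, the groupby approach can be written as a dictionary comprehension, which is the form the question asked about. The use of `operator.itemgetter` and `statistics.mean` is a choice made for this example, not something taken from the original answer.

```python
from itertools import groupby
from operator import itemgetter
from statistics import mean

scores = (('eyal', 76), ('alex', 50), ('oded', 90), ('eyal', 100), ('alex', 99))

# groupby only merges adjacent items, so the pairs must be sorted by name first.
by_name = sorted(scores, key=itemgetter(0))

# Highest score per name, keyed by name.
max_scores = {
    name: max(score for _, score in group)
    for name, group in groupby(by_name, key=itemgetter(0))
}

# Same shape, different aggregate: average score per name.
averages = {
    name: mean(score for _, score in group)
    for name, group in groupby(by_name, key=itemgetter(0))
}

print(max_scores)
print(averages)
```

Only the aggregate function changes between the two comprehensions, which is what makes this variant more general than dict(sorted(scores)).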

"Group by" aggregation with a dictionary comprehension
