Is there a more Pythonic way to write the following max function?

Time: 2012-01-21 21:52:58

Tags: python

def greatest(values):
    value_generator = (v for k,v in values)
    max_value = max(value_generator)
    return (k for k,v in values if v == max_value)

sample_data = ( ('id1', 3), ('id2', 5), ('id3', 5) )
items = list( greatest(sample_data) ) # Should produce ['id2', 'id3']

MapReduce, anyone?

5 answers:

Answer 0 (score: 3)

You can compute max_value like this:

max_value = max(sample_data, key=lambda x: x[1])[1]

As noted in the comments, you can also use itemgetter as the key function for max():

import operator 
max_value = max(sample_data, key=operator.itemgetter(1))[1]

So your code would become (using itemgetter and returning a list directly):

import operator 
def greatest(values):
    max_value = max(values, key=operator.itemgetter(1))[1]
    return [k for k,v in values if v == max_value]
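
To make the trailing [1] less mysterious, here is a small sketch of what max() returns when given a key: the whole winning tuple, not the value the key extracted, and on ties the first maximal item. The name best_pair is only for illustration.

```python
from operator import itemgetter

sample_data = (('id1', 3), ('id2', 5), ('id3', 5))

# With key=, max() compares items by their second field but still
# returns the whole tuple; on ties the first maximal item wins.
best_pair = max(sample_data, key=itemgetter(1))
print(best_pair)     # ('id2', 5)

# Indexing with [1] extracts just the maximum value.
max_value = best_pair[1]
print(max_value)     # 5
```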

Answer 1 (score: 3)

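The built-in max function has an optional key parameter that customizes how items are ordered. The function below compares the tuples by their second item to find the maximum value, then returns every key that has that value: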

>>> sample_data = ('id1',3),('id2',5),('id3',5)
>>> def greatest(values):
...   m = max(values,key=lambda n: n[1])[1]
...   return [k for k,v in values if v==m]
...
>>> greatest(sample_data)
['id2', 'id3']

Answer 2 (score: 3)

Try this:

from operator import itemgetter

def greatest(values):
    m = max(values, key=itemgetter(1))[1]
    return [k for k,v in values if v == m]

And use it like this:

>>> sample_data = (('id1', 3), ('id2', 5), ('id3', 5))
>>> greatest(sample_data)
['id2', 'id3']

Answer 3 (score: 2)

In fact, according to my tests, your version of greatest is faster, not that it matters much:

>>> import random
>>> from operator import itemgetter
>>> def greatest_orig(values):
...     value_generator = (v for k,v in values)
...     max_value = max(value_generator)
...     return (k for k,v in values if v == max_value)
... 
>>> def greatest_max_key(values):
...     max_value = max(values, key=itemgetter(1))[1]
...     return (k for k,v in values if v == max_value)
... 
>>> sample_data = tuple(('id' + str(i), random.randrange(0, 1000)) for i in range(10000))
>>> list(greatest_orig(sample_data)) == list(greatest_max_key(sample_data))
True
>>> %timeit list(greatest_orig(sample_data))
1000 loops, best of 3: 1.67 ms per loop
>>> %timeit list(greatest_max_key(sample_data))
1000 loops, best of 3: 1.74 ms per loop
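
Note that %timeit is an IPython magic. Outside IPython, roughly the same comparison can be run with the standard timeit module. This is only a sketch: the repeat and loop counts are arbitrary choices, and the numbers will vary by machine and Python version.

```python
import random
import timeit
from operator import itemgetter


def greatest_orig(values):
    max_value = max(v for k, v in values)
    return (k for k, v in values if v == max_value)


def greatest_max_key(values):
    max_value = max(values, key=itemgetter(1))[1]
    return (k for k, v in values if v == max_value)


sample_data = tuple(('id' + str(i), random.randrange(0, 1000))
                    for i in range(10000))

loops = 50  # arbitrary; raise it for steadier numbers
for func in (greatest_orig, greatest_max_key):
    # Keep the fastest of 3 runs, similar in spirit to "best of 3".
    best = min(timeit.repeat(lambda: list(func(sample_data)),
                             repeat=3, number=loops))
    print('%s: %.3f ms per loop' % (func.__name__, best / loops * 1000))
```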

Of course, if you don't like assigning your generator to a name, you can pass the generator directly to max, which is more readable than max(values, key=itemgetter(1))[1], IMHO:

>>> def greatest_max_iter(values):
...     max_value = max((v for k, v in values))
...     return (k for k, v in values if v == max_value)
... 
>>> list(greatest_orig(sample_data)) == list(greatest_max_iter(sample_data))
True
>>> %timeit list(greatest_max_iter(sample_data))
1000 loops, best of 3: 1.67 ms per loop

Python allows you to omit the outer parentheses when doing this:

>>> def greatest_max_iter(values):
...     max_value = max(v for k, v in values)
...     return (k for k, v in values if v == max_value)
... 

But for reasons I don't understand, this is slightly slower:

>>> %timeit list(greatest_max_iter(sample_data))
1000 loops, best of 3: 1.69 ms per loop
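
A quick check suggests the two spellings are the same program as far as Python is concerned: with the standard ast module, both expressions parse to the same tree, so the small timing gap is most likely measurement noise rather than a real difference. The following is a sketch of that check, not a benchmark.

```python
import ast

with_parens = "max((v for k, v in values))"
without_parens = "max(v for k, v in values)"

# ast.dump() omits line/column attributes by default, so this compares
# only the structure of the two expressions.
tree_a = ast.dump(ast.parse(with_parens, mode="eval"))
tree_b = ast.dump(ast.parse(without_parens, mode="eval"))
same_tree = tree_a == tree_b
print(same_tree)
```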

These are all really micro-optimizations and not very important. But I think readability favors max(v for k, v in values) over max((v for k, v in values)) or max(values, key=itemgetter(1))[1].

Answer 4 (score: 1)

If you want to use map:

>>> sample_data = ( ('id1', 3), ('id2', 5), ('id3', 5) )
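>>> # max() returns the whole tuple, so take [1] to get the value itself
>>> max_value = max(sample_data, key=lambda x: x[1])[1]
>>> # list() is needed on Python 3, where map() returns an iterator
>>> list(map(lambda x: x[0], filter((lambda x: x[1]==max_value), sample_data)))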
['id2', 'id3']