Counter.most_common is a bit misleading

Time: 2015-11-18 21:47:56

Tags: python python-3.x

from collections import Counter

min_number_of_tops = 3
c = Counter(['x','b','c','d','e','x','b','c','d','x','b'])
print(c.most_common(min_number_of_tops))

Output:

[('b', 3), ('x', 3), ('c', 2)] 

or:

[('b', 3), ('x', 3), ('d', 2)] 

But I would prefer most_common to return something like:

[('b', 3), ('x', 3), ('d', 2), ('c', 2)]

because I am interested in including all elements that share a given count.

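In any case, I need to generate a sorted list of the top results that also includes any other items with the same count as the third item. For example: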

['x', 'b', 'c', 'd']

Here is my attempt at producing this list:

def elements_above_cutoff(elem_value_pairs, cutoff):
    ''' presuming that the pairs are sorted from highest value to lowest '''
    for e,v in elem_value_pairs:
        if v >= cutoff:
            yield e
        else:
            return

min_number_of_tops = 3
c = Counter(['x','b','c','d','e','x','b','c','d','x','b'])
cutoff = c.most_common(min_number_of_tops)[-1][1]  # count of the third-ranked item
print(list(elements_above_cutoff(c.most_common(), cutoff)))

which gives:

['b', 'x', 'd', 'c']

Can you suggest a better approach? I am using Python 3.

1 answer:

Answer 0 (score: 0)

This simple approach works:

import collections
common = collections.Counter(['x','b','c','d','e','x','b','c','d','x','b']).most_common()
min_number_of_tops = 3
if len(common)>min_number_of_tops:
    freq = common[min_number_of_tops-1][1]
    for i in range(min_number_of_tops, len(common)):
        if common[i][1] < freq:
            common = common[:i]
            break
print(common)   

Output:

[('b', 3), ('x', 3), ('d', 2), ('c', 2)]

A slightly more concise way uses a list comprehension, though it may be less readable:

import collections
common = collections.Counter(['x','b','c','d','e','x','b','c','d','x','b']).most_common()
min_number_of_tops = 3
testfreq = common[min_number_of_tops-1][1] if len(common)>min_number_of_tops else 0
common = [x for x, freq in common if freq >= testfreq]
print(common) 

Output:

['b', 'x', 'd', 'c']
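
Both variants can also be wrapped in a reusable function. The following is only a sketch: the name most_common_with_ties is made up for illustration and is not part of collections. It relies on most_common() returning pairs sorted from highest count to lowest, so itertools.takewhile can stop at the first count below the cutoff. The order of elements with equal counts depends on the Python version, so the tied items may print in a different order than shown above.

```python
from collections import Counter
from itertools import takewhile


def most_common_with_ties(counter, n):
    """Return the n most common (element, count) pairs, plus any
    further pairs whose count equals that of the nth pair.

    Hypothetical helper, not part of the standard library.
    """
    ranked = counter.most_common()  # sorted from highest count to lowest
    if n <= 0:
        return []
    if n >= len(ranked):
        return ranked
    cutoff = ranked[n - 1][1]  # count of the nth-ranked pair
    # The list is sorted by descending count, so we can stop at the
    # first pair whose count drops below the cutoff.
    return list(takewhile(lambda pair: pair[1] >= cutoff, ranked))


c = Counter(['x', 'b', 'c', 'd', 'e', 'x', 'b', 'c', 'd', 'x', 'b'])
print(most_common_with_ties(c, 3))
print([elem for elem, _ in most_common_with_ties(c, 3)])
```

The explicit checks for n <= 0 and n >= len(ranked) avoid the negative index and the IndexError that common[n-1] would otherwise run into.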