Question

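I have a Counter of the form {(f1, f2): count}. When I run Counter.most_common() on it I get the correct result, but I want to restrict most_common() to the entries that pass some filter on f2. For example, f2 = 'A' should return the most common elements that have f2 = 'A'. How can I do this?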

Answer 1

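If we look at the source code for Counter, we see that most_common(n) uses heapq.nlargest, which keeps the cost at roughly O(n log k), where k is the number of items requested and n is the size of the Counter, instead of the O(n log n) of a full sort. The method reads: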

def most_common(self, n=None):
    '''List the n most common elements and their counts from the most
    common to the least.  If n is None, then list all element counts.

    >>> Counter('abcdeabcdabcaba').most_common(3)
    [('a', 5), ('b', 4), ('c', 3)]

    '''
    # Emulate Bag.sortedByCount from Smalltalk
    if n is None:
        return sorted(self.items(), key=_itemgetter(1), reverse=True)
    return _heapq.nlargest(n, self.items(), key=_itemgetter(1))

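Since this already costs at least O(n), an extra O(n) filtering pass does not change the asymptotic cost, so we can simply filter the counter and take its items: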

from collections import Counter

counts = Counter([(1, "A"), (2, "A"), (1, "A"), (2, "B"), (1, "B")])

Counter({(f1, f2): n for (f1, f2), n in counts.items() if f2 == "A"}).most_common(2)
#>>> [((1, 'A'), 2), ((2, 'A'), 1)]

Skipping the intermediate Counter and calling heapq.nlargest directly may be slightly faster, if that matters:

import heapq
from operator import itemgetter

filtered = [((f1, f2), n) for (f1, f2), n in counts.items() if f2 == "A"]
heapq.nlargest(2, filtered, key=itemgetter(1))
#>>> [((1, 'A'), 2), ((2, 'A'), 1)]
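
If the filter changes from call to call, the same idea can be wrapped in a small function. The following is only a sketch: most_common_where and its predicate parameter are hypothetical names used for illustration and are not part of collections.Counter. It mirrors the two branches of most_common shown above, sorting when n is None and using heapq.nlargest otherwise.

```python
import heapq
from collections import Counter
from operator import itemgetter


def most_common_where(counter, predicate, n=None):
    """Return the n most common (key, count) pairs whose key passes predicate.

    Hypothetical helper for illustration; not part of collections.Counter.
    """
    # Lazily keep only the items whose key passes the filter.
    filtered = ((key, count) for key, count in counter.items() if predicate(key))
    if n is None:
        # No limit requested: sort every matching item by count, descending.
        return sorted(filtered, key=itemgetter(1), reverse=True)
    # Otherwise let heapq pick the n largest counts without a full sort.
    return heapq.nlargest(n, filtered, key=itemgetter(1))


counts = Counter([(1, "A"), (2, "A"), (1, "A"), (2, "B"), (1, "B")])
top_a = most_common_where(counts, lambda key: key[1] == "A", n=2)
print(top_a)
#>>> [((1, 'A'), 2), ((2, 'A'), 1)]
```

Passing the filter as a callable keeps the function independent of the key layout, so the same helper would also work for a filter on f1 or on both fields.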
