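I have a dictionary named wordCounts that maps each word to the number of times it occurs. How can I get the top n words from the dict, while allowing more than n when there is a tie?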
Answer 0 (score: 2)
As the previous answer said, you can convert the dict to a Counter to make this data set easier to work with.
>>> from collections import Counter
>>> d = {"d":1,"c":2,"a":3,'b':3,'e':0,'f':1}
>>> c = Counter(d)
>>> c
Counter({'b': 3, 'a': 3, 'c': 2, 'f': 1, 'd': 1, 'e': 0})
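Counter has a most_common(n) method, which returns the n most common elements. Note that it stops at n elements, so anything tied with the n-th element is cut off. For example: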
>>> c.most_common(4)
[('b', 3), ('a', 3), ('c', 2), ('f', 1)]
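To include every value equal to the n-th element, you can do something like the following, without converting to a Counter. It is rather clunky, but it should do the trick.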
def most_common_inclusive(freq_dict, n):
    # find the nth most common value (assumes 1 <= n <= len(freq_dict))
    nth_most_common = sorted(freq_dict.values(), reverse=True)[n - 1]
    # keep every entry whose count is at least that value, so ties are included
    return {k: v for k, v in freq_dict.items() if v >= nth_most_common}
You can use it as follows:
>>> d = {'b': 3, 'a': 3, 'c': 2, 'f': 1, 'd': 1, 'e': 0}
>>> most_common_inclusive(d, 4)
{'d': 1, 'b': 3, 'c': 2, 'f': 1, 'a': 3}
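If an ordered result is preferred over a dict, the same cutoff idea can be sketched on top of Counter. This is only one possible variant: the name most_common_with_ties is made up for illustration, and clamping n to the number of entries is an assumption about what should happen when n is too large.

```python
from collections import Counter
from itertools import takewhile


def most_common_with_ties(freq_dict, n):
    """Return (word, count) pairs for the top n words plus any words tied with the n-th."""
    ranked = Counter(freq_dict).most_common()  # every item, highest count first
    if n <= 0 or not ranked:
        return []
    # Count of the n-th ranked item; fall back to the last item if n is too large.
    cutoff = ranked[min(n, len(ranked)) - 1][1]
    # Keep items from the front of the ranking while they reach the cutoff.
    return list(takewhile(lambda pair: pair[1] >= cutoff, ranked))


word_counts = {"d": 1, "c": 2, "a": 3, "b": 3, "e": 0, "f": 1}
print(most_common_with_ties(word_counts, 4))
```

With n = 4 the cutoff count is 1, so both 'd' and 'f' are returned along with the three more frequent words.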
Answer 1 (score: 1)
One solution could be:
from collections import Counter, defaultdict

list_of_words = ['dog', 'cat', 'moo', 'dog', 'pun', 'pun']

def get_n_most_common(n, list_of_words):
    ct = Counter(list_of_words)
    # group the words by how often they occur
    d = defaultdict(list)
    for word, quantity in ct.items():
        d[quantity].append(word)
    # the distinct counts, highest first
    most_common = sorted(d.keys(), reverse=True)
    # n selects the n highest counts, so every word sharing one of them is returned
    return [(word, val) for val in most_common[:n] for word in d[val]]
Test:
>> get_n_most_common(2, list_of_words)
=> [('pun', 2), ('dog', 2), ('moo', 1), ('cat', 1)]
>> get_n_most_common(1, list_of_words)
=> [('pun', 2), ('dog', 2)]
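The test returns four words for n = 2 because, in this answer, n counts distinct frequencies rather than words. A small sketch of the intermediate grouping may make that easier to see; the helper name group_words_by_count is hypothetical and not part of the answer.

```python
from collections import Counter, defaultdict


def group_words_by_count(words):
    """Map each frequency to the list of words that occur that many times."""
    groups = defaultdict(list)
    for word, quantity in Counter(words).items():
        groups[quantity].append(word)
    return dict(groups)


words = ['dog', 'cat', 'moo', 'dog', 'pun', 'pun']
groups = group_words_by_count(words)

# Taking only the single highest frequency still yields two words, because they tie.
highest = max(groups)
print(highest, sorted(groups[highest]))
```

If n should instead count words, the cutoff has to be taken from the n-th ranked word rather than from the n-th distinct frequency.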
Answer 2 (score: 1)
MooingRawr is on the right track, but now we need to get the top n results:
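# assumes d is the word-count dict and n is the number of results wanted
l = []
for i, (word, count) in enumerate(sorted(d.items(), reverse=True, key=lambda x: x[1])):
    # once n items are collected, stop at the first count below the last one kept
    if i >= n and (not l or count < l[-1][1]):
        break
    l.append((word, count))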
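For reuse, the loop could be wrapped in a function along these lines. This is a sketch rather than part of the original answer: the function name top_n_with_ties and the sample data are assumptions, and the extra emptiness check is there only so that n = 0 returns an empty list.

```python
def top_n_with_ties(word_counts, n):
    """Walk the items from most to least frequent and stop after the n-th,
    unless the next item ties with the last one kept."""
    result = []
    ranked = sorted(word_counts.items(), key=lambda item: item[1], reverse=True)
    for i, (word, count) in enumerate(ranked):
        # `result` is checked first so n = 0 does not index an empty list.
        if i >= n and (not result or count < result[-1][1]):
            break
        result.append((word, count))
    return result


word_counts = {"dog": 2, "cat": 1, "moo": 1, "pun": 2}
print(top_n_with_ties(word_counts, 1))
print(top_n_with_ties(word_counts, 3))
```

Here n = 1 returns both words with a count of 2, and n = 3 returns all four words because the fourth ties with the third.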