Question

def words(word,number):
    if number<len(word):
        result={}
        for key,value in word.items():
            common_num=sorted(set(word.values()), reverse=True)[:number]
            if value in common_num:
                result.update({key:value})
        word.clear()
        word.update(result)
        new_word_count={}
        common_word=[]
        common=[]
        for key, value in word.items():
            if value in common_word:
                common.append(value)
            common_word.append(value)
        new_word_count=dict(word)
        for key,value in new_word_count.items():
            if value in common:
                del word[key]

Example:

>>> word={'a': 2, 'b': 2, 'c' : 3, 'd': 3, 'e': 4, 'f' : 4, 'g' : 5}
>>> words(word,3)

My output: {'g': 5}

Expected output: {'g': 5, 'e': 4, 'f': 4}

Any idea why I am getting this output?

Answer 1

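Well, without any special imports, there are simpler ways to accomplish what you're trying to do. You're going through a lot of trouble tracking and storing the values to keep, then deleting, then re-adding, when you could simplify things a lot. Even with explanatory comments, this is much shorter: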

def common_words(word_count, number):
    # Early out when no filtering needed
    if number >= len(word_count):
        return

    # Get the top number+1 keys based on their values
    top = sorted(word_count, key=word_count.get, reverse=True)[:number+1]

    # We kept one more than we needed to figure out what the tie would be
    tievalue = word_count[top.pop()]

    # If there is a tie, we keep popping until the tied values are gone
    while top and tievalue == word_count[top[-1]]:
        top.pop()

    # top is now the keys we should retain, easy to compute keys to delete
    todelete = word_count.keys() - top
    for key in todelete:
        del word_count[key]

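There are slightly better ways to do this that avoid the repeated lookups in word_count (sorting the items rather than the keys, etc.), but this is easier to understand IMO, and the extra lookups in word_count are bounded and linear, so it's not a big deal.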
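For the curious, the items-sorting variant might look roughly like the sketch below. The name common_words_items is hypothetical, and it assumes the same tie rule as the function above: any value tied with the first excluded entry is dropped.

```python
def common_words_items(word_count, number):
    """Trim word_count in place to its top `number` entries, dropping boundary ties."""
    # Early out when no filtering is needed
    if number >= len(word_count):
        return

    # Sort the (key, value) pairs once, so no further lookups in word_count are needed
    ranked = sorted(word_count.items(), key=lambda item: item[1], reverse=True)

    # The value just past the cutoff tells us which value would be a tie
    tievalue = ranked[number][1]

    # Keep only keys from the first `number` pairs whose value beats the tie value
    keep = {key for key, value in ranked[:number] if value > tievalue}

    # Delete everything else; list() avoids mutating the dict while iterating over it
    for key in list(word_count):
        if key not in keep:
            del word_count[key]
```

With the dictionary from the question and number set to 3, this leaves only the entries for 'e', 'f' and 'g'.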

Answer 2

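Although the author mentioned in the comments that they want to avoid Counter(), for those interested in seeing how to apply it, here is the short solution suggested by @ShadowRanger: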

import collections as ct

word={'a': 2, 'b': 2, 'c' : 3, 'd': 3, 'e': 4, 'f' : 4, 'g' : 5}
words = ct.Counter(word)
words.most_common(3)
# [('g', 5), ('f', 4), ('e', 4)]  (the order of the tied 'e' and 'f' may differ by Python version)
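
Note that most_common() returns a list of pairs, not a dict. If a dict is wanted, one possible follow-up (a sketch, not part of the original suggestion; the names top_three and top_two are just for illustration) is to pass the result to dict(). It also shows a caveat: unlike the function in Answer 1, most_common() does not drop ties at the cutoff.

```python
import collections as ct

word = {'a': 2, 'b': 2, 'c': 3, 'd': 3, 'e': 4, 'f': 4, 'g': 5}

# most_common returns (key, count) pairs; dict() turns them back into a mapping
top_three = dict(ct.Counter(word).most_common(3))
print(top_three == {'g': 5, 'e': 4, 'f': 4})  # True

# Caveat: a tie at the cutoff is cut rather than dropped as a group,
# so asking for 2 keeps 'g' plus just one of the tied 'e' / 'f'
top_two = dict(ct.Counter(word).most_common(2))
print(len(top_two))  # 2
```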

Counting common words in Python
