Question

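Given a dictionary and a limit on the number of keys in a new dictionary, I want the new dictionary to contain the keys with the highest values.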

The given dictionary is:

dictation = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1 }

I want to get a new dictionary that contains at most limit keys.

For example, for limit = 1, the new dictionary is

{'apple':5}

For limit = 2:

{'apple':5, 'pears':4}

I tried:

return dict(sorted(dictation.items(),key=lambda x: -x[1])[:limit])

But when I try limit = 3, I get

{'apple':5, 'pears':4, 'orange':3}

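But it should not contain 'orange': 3. Orange and kiwi have the same priority, and including both would exceed the limit, so neither should be included. I should get back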

{'apple':5, 'pears':4}
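
To make the failure concrete, here is a minimal sketch of the rule described above. The helper names top_by_slice and splits_a_tie are illustrative only, not part of any library. The first reproduces the attempted slice; the second reports whether the cut at limit lands inside a group of equal values, which is exactly the case that should be rejected.

```python
def top_by_slice(d, limit):
    """The attempted approach: sort by value, descending, and cut at `limit`."""
    return dict(sorted(d.items(), key=lambda kv: -kv[1])[:limit])


def splits_a_tie(d, limit):
    """Return True if cutting after `limit` items separates equal values."""
    ranked = sorted(d.values(), reverse=True)
    return 0 < limit < len(ranked) and ranked[limit - 1] == ranked[limit]


data = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}

print(top_by_slice(data, 3))   # {'apple': 5, 'pears': 4, 'orange': 3}
print(splits_a_tie(data, 3))   # True: orange and kiwi are separated
print(splits_a_tie(data, 2))   # False: the cut falls between 4 and 3
```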

Answer 1

One approach is to use collections.Counter and most_common(n). Ask for one extra item, then keep popping from the end until the last value changes:

from collections import Counter

dct = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1}
n = 3

items = Counter(dct).most_common(n+1)
if len(items) > n:
    # Drop every item tied with the (n+1)-th value
    last_val = items[-1][1]
    while items and items[-1][1] == last_val:
        items.pop()

new = dict(items)
# {'apple': 5, 'pears': 4}
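
The same idea can be wrapped in a function so it is easy to try with several limits. This is only a sketch: the name top_n_no_split_ties is made up for illustration, and the emptiness check in the loop is an added guard for the case where every value is tied.

```python
from collections import Counter


def top_n_no_split_ties(d, n):
    """Keep at most n highest-valued items, dropping any tie group that would overflow n."""
    items = Counter(d).most_common(n + 1)
    if len(items) > n:
        # The (n+1)-th value is the cutoff: everything tied with it must go.
        cutoff = items[-1][1]
        while items and items[-1][1] == cutoff:
            items.pop()
    return dict(items)


data = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
for limit in range(6):
    print(limit, top_n_no_split_ties(data, limit))
```

With limit = 3 the tied pair orange/kiwi is dropped, while limit = 5 returns all five items because nothing overflows.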

Answer 2

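This is not great computationally, but it works. It creates a Counter object to get the data in sorted order, and an inverted defaultdict that maps each score to the list of keys with that score. It then uses a bit of arithmetic to build the result: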

from collections import defaultdict, Counter

def gimme(d,n):
    c = Counter(d)
    grpd = defaultdict(list)
    for key,value in c.items():
        grpd[value].append(key)


    result = {}
    for key,value in c.most_common():
        if len(grpd[value])+len(result) <= n:
            result.update( {k:value for k in grpd[value] } )
        else:
            break
    return result

Test:

data = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1 }

for k in range(10):
    print(k, gimme(data,k))

Output:

0 {}
1 {'apple': 5}
2 {'apple': 5, 'pears': 4}
3 {'apple': 5, 'pears': 4}
4 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3}
5 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3}
6 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
7 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
8 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
9 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
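
In the output above, n = 5 returns only four items even though all five would fit. Once the tied group for value 3 has been added, the loop reaches the second key of that group, the size check fails (2 + 4 > 5), and it breaks before reaching banana. One possible way around this, sketched here with itertools.groupby, is to visit each distinct value exactly once. The function name top_n_by_groups is hypothetical and not part of the answer's code.

```python
from itertools import groupby
from operator import itemgetter


def top_n_by_groups(d, n):
    """Add whole groups of equal values, highest first, while they still fit in n."""
    ranked = sorted(d.items(), key=itemgetter(1), reverse=True)
    result = {}
    for _, group in groupby(ranked, key=itemgetter(1)):
        members = list(group)
        if len(result) + len(members) > n:
            break  # this tie group would overflow the limit
        result.update(members)
    return result


data = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
print(top_n_by_groups(data, 3))  # {'apple': 5, 'pears': 4}
print(top_n_by_groups(data, 5))
```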

Answer 3

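Note that filtering by the top n does not, by default, exclude tied values that would push the result past the specified limit. This is by design.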

The trick is to look at the (n + 1)th highest value and make sure every value in your result is strictly higher than it:

from heapq import nlargest

dictation = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1}

n = 3
largest_items = nlargest(n+1, dictation.items(), key=lambda x: x[1])
n_plus_one_value = largest_items[-1][1]

res = {k: v for k, v in largest_items if v > n_plus_one_value}

print(res)

{'apple': 5, 'pears': 4}

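This assumes the dictionary has more than n items, so that nlargest really returns n + 1 of them. Otherwise you can simply use the input dictionary as the result.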
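
The short-dictionary case can be folded into a small helper. The following is a sketch under that assumption: the name top_n_above_cutoff is illustrative, and the early return is one way to handle a dictionary with n or fewer items.

```python
from heapq import nlargest
from operator import itemgetter


def top_n_above_cutoff(d, n):
    """Keep items whose value is strictly above the (n+1)-th highest value."""
    if n >= len(d):
        # Nothing can overflow the limit, so the whole input qualifies.
        return dict(d)
    largest = nlargest(n + 1, d.items(), key=itemgetter(1))
    cutoff = largest[-1][1]
    return {k: v for k, v in largest if v > cutoff}


data = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
print(top_n_above_cutoff(data, 3))  # {'apple': 5, 'pears': 4}
print(top_n_above_cutoff(data, 5))
```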

The dictionary comprehension may seem expensive. For larger inputs you can use bisect instead, for example:

from heapq import nlargest
from operator import itemgetter
from bisect import bisect

dictation = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1}

n = 3
largest_items = nlargest(n+1, dictation.items(), key=lambda x: x[1])
n_plus_one_value = largest_items[-1][1]

index = bisect(list(map(itemgetter(1), largest_items))[::-1], n_plus_one_value)

res = dict(largest_items[:len(largest_items) - index])

print(res)

{'apple': 5, 'pears': 4}
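
To see what the bisect step is doing, it can help to print the intermediate values. The sketch below reuses the same data; the variable names ascending and tied are introduced here purely for illustration. Reversing the values gives an ascending list, and bisect (which is bisect_right) then returns how many entries are less than or equal to the cutoff, that is, how many items to trim from the end.

```python
from bisect import bisect
from heapq import nlargest
from operator import itemgetter

data = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
n = 3

largest = nlargest(n + 1, data.items(), key=itemgetter(1))
cutoff = largest[-1][1]

# Values in ascending order, as bisect requires a sorted list
ascending = [value for _, value in reversed(largest)]
tied = bisect(ascending, cutoff)  # count of values <= cutoff

print(ascending)  # [3, 3, 4, 5]
print(tied)       # 2
print(dict(largest[:len(largest) - tied]))  # {'apple': 5, 'pears': 4}
```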
