Question

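I have a big dictionary object with several key-value pairs (about 16), but I am only interested in 3 of them. What is the best way (shortest/most efficient/most elegant) to achieve that?

The best I know is: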

bigdict = {'a':1,'b':2,....,'z':26} 
subdict = {'l':bigdict['l'], 'm':bigdict['m'], 'n':bigdict['n']}

I am sure there is a more elegant way than this. Ideas?

Answer 1

You could try:

dict((k, bigdict[k]) for k in ('l', 'm', 'n'))

...or in ~~Python 3~~ Python versions 2.7 or later (thanks to Fábio Diniz for pointing out that it works in 2.7 too):

{k: bigdict[k] for k in ('l', 'm', 'n')}

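Update: As Håvard S points out, I'm assuming that you know the keys are going to be in the dictionary. See his answer if you aren't able to make that assumption. Alternatively, as timbo points out in the comments, if you want a key that is missing from bigdict to map to None, you can do: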

{k: bigdict.get(k, None) for k in ('l', 'm', 'n')}

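If you're using Python 3, and you only want keys in the new dict that actually exist in the original one, you can use the fact that view objects implement some set operations: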

{k: bigdict[k] for k in bigdict.keys() & {'l', 'm', 'n'}}
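
To make the difference between the three variants concrete, here is a small sketch. The sample `bigdict` and the deliberately missing key `'zz'` are made up for illustration; they are not from the question.

```python
import string

# Sample data (an assumption for illustration): 'a' -> 1 ... 'z' -> 26
bigdict = {letter: i for i, letter in enumerate(string.ascii_lowercase, start=1)}
wanted = ('l', 'm', 'zz')  # 'zz' is deliberately missing from bigdict

# Variant 1: plain indexing raises KeyError on the missing key
try:
    strict = {k: bigdict[k] for k in wanted}
except KeyError as exc:
    strict = None
    missing_key = exc.args[0]

# Variant 2: .get() maps missing keys to None
with_none = {k: bigdict.get(k) for k in wanted}

# Variant 3 (Python 3): keys-view intersection silently drops missing keys
present_only = {k: bigdict[k] for k in bigdict.keys() & set(wanted)}

print(missing_key)
print(with_none)
print(present_only)
```

Which variant fits depends on whether a missing key is a bug (variant 1), an expected gap you want to see (variant 2), or something to ignore (variant 3).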

Answer 2

A bit shorter, at least:

wanted_keys = ['l', 'm', 'n'] # The keys you want
dict((k, bigdict[k]) for k in wanted_keys if k in bigdict)

Answer 3

interesting_keys = ('l', 'm', 'n')
subdict = {x: bigdict[x] for x in interesting_keys if x in bigdict}

Answer 4

A speed comparison of all the methods mentioned:

Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Jan 29 2016, 14:26:21) [MSC v.1500 64 bit (AMD64)] on win32
In[2]: import numpy.random as nprnd
keys = nprnd.randint(1000, size=10000)
bigdict = dict([(_, nprnd.rand()) for _ in range(1000)])

%timeit {key:bigdict[key] for key in keys}
%timeit dict((key, bigdict[key]) for key in keys)
%timeit dict(map(lambda k: (k, bigdict[k]), keys))
%timeit dict(filter(lambda i:i[0] in keys, bigdict.items()))
%timeit {key:value for key, value in bigdict.items() if key in keys}
100 loops, best of 3: 3.09 ms per loop
100 loops, best of 3: 3.72 ms per loop
100 loops, best of 3: 6.63 ms per loop
10 loops, best of 3: 20.3 ms per loop
100 loops, best of 3: 20.6 ms per loop

As expected: dictionary comprehensions are the best option.
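
For anyone who wants to repeat the comparison without IPython or NumPy, here is a rough sketch using only the standard library `timeit` module on Python 3. The data sizes mirror the ones above, but the seed, the labels and the use of a set for the membership test are my own choices. With a list or array, `key in keys` is a linear scan, which likely accounts for part of the gap in the last two timings above. Absolute numbers will vary by machine and Python version.

```python
import random
import timeit

random.seed(0)  # fixed seed so repeated runs use the same data
bigdict = {i: random.random() for i in range(1000)}
keys = [random.randrange(1000) for _ in range(10000)]  # contains repeats
key_set = set(keys)  # set membership is O(1) on average

candidates = {
    "comprehension over keys": lambda: {k: bigdict[k] for k in keys},
    "dict() + generator": lambda: dict((k, bigdict[k]) for k in keys),
    "dict() + map": lambda: dict(map(lambda k: (k, bigdict[k]), keys)),
    "filter over items (set lookup)": lambda: dict(
        filter(lambda item: item[0] in key_set, bigdict.items())
    ),
    "comprehension over items (set lookup)": lambda: {
        k: v for k, v in bigdict.items() if k in key_set
    },
}

# Sanity check input: every candidate should build the same dictionary
results = {name: fn() for name, fn in candidates.items()}

for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=100)
    print(f"{name}: {seconds / 100 * 1e3:.3f} ms per call")
```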

Answer 5

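This answer uses a dictionary comprehension similar to the selected answer, but will not raise an exception on a missing item.

Python 2 version: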

{k:v for k, v in bigDict.iteritems() if k in ('l', 'm', 'n')}

Python 3 version:

{k:v for k, v in bigDict.items() if k in ('l', 'm', 'n')}

Answer 6

Maybe:

subdict=dict([(x,bigdict[x]) for x in ['l', 'm', 'n']])

Python 3 (and Python 2.7) even supports the following:

subdict={a:bigdict[a] for a in ['l','m','n']}

Note that you can check for existence in the dictionary as follows:

subdict=dict([(x,bigdict[x]) for x in ['l', 'm', 'n'] if x in bigdict])

or, respectively, for Python 3:

subdict={a:bigdict[a] for a in ['l','m','n'] if a in bigdict}

Answer 7

You can also use map (which is a very useful function to get to know anyway):

sd = dict(map(lambda k: (k, l.get(k, None)), l))

Example:

large_dictionary = {'a1':123, 'a2':45, 'a3':344}
list_of_keys = ['a1', 'a3']
small_dictionary = dict(map(lambda key: (key, large_dictionary.get(key, None)), list_of_keys))

PS: I borrowed the .get(key, None) from a previous answer :)

Answer 8

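Okay, this is something that has bothered me a few times, so thank you Jayesh for asking it.

The answers above seem like as good a solution as any, but if you are using this all over your code, it makes sense to wrap the functionality, IMHO. Also, there are two possible use cases here: one where you care about whether all the keywords are in the original dictionary, and one where you don't. It would be nice to treat both equally.

So, for my two cents, I suggest writing a subclass of dict, e.g.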

class my_dict(dict):
    def subdict(self, keywords, fragile=False):
        d = {}
        for k in keywords:
            try:
                d[k] = self[k]
            except KeyError:
                if fragile:
                    raise
        return d

Now you can pull out a sub-dictionary with

orig_dict.subdict(keywords)

Usage examples:

#
## our keywords are letters of the alphabet
keywords = 'abcdefghijklmnopqrstuvwxyz'
#
## our dictionary maps letters to their index
d = my_dict([(k,i) for i,k in enumerate(keywords)])
print('Original dictionary:\n%r\n\n' % (d,))
#
## constructing a sub-dictionary with good keywords
oddkeywords = keywords[::2]
subd = d.subdict(oddkeywords)
print('Dictionary from odd numbered keys:\n%r\n\n' % (subd,))
#
## constructing a sub-dictionary with mixture of good and bad keywords
somebadkeywords = keywords[1::2] + 'A'
try:
    subd2 = d.subdict(somebadkeywords, fragile=True)
    print("We shouldn't see this message")
except KeyError:
    print("subd2 construction fails:")
    print("\toriginal dictionary doesn't contain some keys\n\n")
#
## Trying again with fragile set to false
try:
    subd3 = d.subdict(somebadkeywords, fragile=False)
    print('Dictionary constructed using some bad keys:\n%r\n\n' % (subd3,))
except KeyError:
    print("We shouldn't see this message")

If you run all the above code, you should see (something like) the following output (sorry for the formatting):

Original dictionary:
{'a': 0, 'c': 2, 'b': 1, 'e': 4, 'd': 3, 'g': 6, 'f': 5, 'i': 8, 'h': 7, 'k': 10, 'j': 9, 'm': 12, 'l': 11, 'o': 14, 'n': 13, 'q': 16, 'p': 15, 's': 18, 'r': 17, 'u': 20, 't': 19, 'w': 22, 'v': 21, 'y': 24, 'x': 23, 'z': 25}

Dictionary from odd numbered keys:
{'a': 0, 'c': 2, 'e': 4, 'g': 6, 'i': 8, 'k': 10, 'm': 12, 'o': 14, 'q': 16, 's': 18, 'u': 20, 'w': 22, 'y': 24}

subd2 construction fails:
    original dictionary doesn't contain some keys

Dictionary constructed using some bad keys:
{'b': 1, 'd': 3, 'f': 5, 'h': 7, 'j': 9, 'l': 11, 'n': 13, 'p': 15, 'r': 17, 't': 19, 'v': 21, 'x': 23, 'z': 25}
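
If subclassing dict feels heavy, the same two use cases can probably be covered by a plain function that works on any mapping. This is only a sketch: `extract_subdict` and its `strict` flag are hypothetical names of my own, not part of the class in this answer.

```python
def extract_subdict(source, keywords, strict=False):
    """Return a new dict holding only the given keywords.

    With strict=True a missing keyword raises KeyError;
    otherwise missing keywords are skipped silently.
    """
    if strict:
        return {k: source[k] for k in keywords}
    return {k: source[k] for k in keywords if k in source}


letters = 'abcdefghijklmnopqrstuvwxyz'
d = {k: i for i, k in enumerate(letters)}

# Lenient mode: the unknown keyword 'A' is skipped
lenient = extract_subdict(d, 'lmnA')

# Strict mode: the unknown keyword 'A' raises KeyError
try:
    extract_subdict(d, 'lmnA', strict=True)
    strict_failed = False
except KeyError:
    strict_failed = True

print(lenient)
print(strict_failed)
```

The trade-off is that a function needs no special dictionary type, while the subclass gives you the `d.subdict(...)` method syntax.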

Answer 9

Solution

from operator import itemgetter
from typing import List, Dict, Union


def subdict(d: Union[Dict, List], columns: List[str]) -> Union[Dict, List[Dict]]:
    """Return a dict or list of dicts with subset of 
    columns from the d argument.
    """
    getter = itemgetter(*columns)

    if isinstance(d, list):
        result = []
        for subset in map(getter, d):
            record = dict(zip(columns, subset))
            result.append(record)
        return result
    elif isinstance(d, dict):
        return dict(zip(columns, getter(d)))

    raise ValueError('Unsupported type for `d`')

Examples of use

# pure dict

d = dict(a=1, b=2, c=3)
print(subdict(d, ['a', 'c']))

>>> In [5]: {'a': 1, 'c': 3}

# list of dicts

d = [
    dict(a=1, b=2, c=3),
    dict(a=2, b=4, c=6),
    dict(a=4, b=8, c=12),
]

print(subdict(d, ['a', 'c']))

>>> In [5]: [{'a': 1, 'c': 3}, {'a': 2, 'c': 6}, {'a': 4, 'c': 12}]
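
One edge case worth knowing about with this approach: `operator.itemgetter` returns a tuple when given several keys but the bare value when given exactly one, so zipping column names against its result breaks for a single column. The sketch below demonstrates this; `subdict_any_width` is a hypothetical alternative of mine, not part of the solution above.

```python
from operator import itemgetter

d = dict(a=1, b=2, c=3)

# With two or more keys, itemgetter returns a tuple of values
pair = itemgetter('a', 'c')(d)

# With exactly one key, it returns the bare value, not a 1-tuple
single = itemgetter('a')(d)

# So zipping one column name against the result fails: an int is not iterable
try:
    dict(zip(['a'], itemgetter('a')(d)))
    one_column_ok = True
except TypeError:
    one_column_ok = False


def subdict_any_width(record, columns):
    """Hypothetical variant that handles zero, one, or many columns."""
    return {c: record[c] for c in columns}


print(pair, single, one_column_ok)
print(subdict_any_width(d, ['a']))
```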

Answer 10

Yet another one (I prefer Mark Longair's answer):

di = {'a':1,'b':2,'c':3}
req = ['a','c','w']
dict([i for i in di.iteritems() if i[0] in di and i[0] in req])
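
`iteritems()` exists only in Python 2. A sketch of the same idea for Python 3 follows; turning `req` into a set is my substitution, and since the pairs come from `di` itself, the `i[0] in di` check is always true and is dropped here.

```python
di = {'a': 1, 'b': 2, 'c': 3}
req = {'a', 'c', 'w'}  # a set makes `in` an average O(1) check

# items() replaces Python 2's iteritems(); 'w' is absent from di, so it never appears
subset = {k: v for k, v in di.items() if k in req}
print(subset)
```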

Answer 11

You can use an iterator over the desired keys to extract the desired pairs:

<?php

    for ($i=0; $i <= 10; $i++) {

        $fp = fopen('lidn.txt', 'w');

    $a = [$i];
    foreach ($a as $value) {
        fwrite($fp ,$i);


        echo "$value";

    fclose($fp);

}
}

?>

Should be faster than anything offered before. ;)

Answer 12

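Using map (halfdanrump's answer) is best for me, though I haven't timed it...

But if you go for a dictionary comprehension, and you have a big_dict:

Make absolutely certain you loop through req. This is crucial, and it affects the running time of the algorithm (big O, theta, you name it).
Write it generically enough to avoid errors if keys are not there.

For example: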

big_dict = {'a':1,'b':2,'c':3,................................................}
req = ['a','c','w']

{k:big_dict.get(k,None) for k in req}
# or 
{k:big_dict[k] for k in req if k in big_dict}

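Note that in the converse case, where req is big but the dictionary is small, you should loop through the dictionary instead.

In general, we are doing an intersection, and the complexity of the problem is O(min(len(dict), len(req))). Python's own implementation of intersection considers the size of the two sets, so it seems optimal. Also, being in C and part of the core library, it is probably faster than most unoptimized Python statements. Therefore, a solution that I would consider is: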

dic = {'a':1,'b':2,'c':3,................................................}
req = ['a','c','w',...................]

{k:dic[k] for k in set(req).intersection(dic.keys())}

It moves the critical operation inside Python's C code and will work in all cases.
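
As a runnable sketch of the intersection idea, with small made-up data in place of the elided dictionaries above (`intersect_subdict` is a hypothetical helper name). One caveat: building `set(req)` already takes time proportional to `len(req)`, so this is not strictly bounded by the smaller of the two sizes.

```python
def intersect_subdict(source, req):
    """Keep only the requested keys that actually exist in source."""
    # set.intersection accepts any iterable, including a dict keys view
    common = set(req).intersection(source.keys())
    return {k: source[k] for k in common}


big = {'a': 1, 'b': 2, 'c': 3}
result = intersect_subdict(big, ['a', 'c', 'w'])  # 'w' is not in big
print(result)
```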

Extract a subset of key-value pairs from a Python dictionary object?
