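I have a big dictionary object with several key-value pairs (about 16), but I am only interested in 3 of them. What is the best way (shortest/most efficient/most elegant) to extract just those?
The best I know is: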
bigdict = {'a':1,'b':2,....,'z':26}
subdict = {'l':bigdict['l'], 'm':bigdict['m'], 'n':bigdict['n']}
I am sure there is a more elegant way than this. Ideas?
Answer 0 (score: 350)
You could try:
dict((k, bigdict[k]) for k in ('l', 'm', 'n'))
...or, in Python 3 or in Python 2.7 and later (thanks to Fábio Diniz for pointing out that it works in 2.7 too):
{k: bigdict[k] for k in ('l', 'm', 'n')}
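Update: As Håvard S points out, I'm assuming that you know the keys are going to be in the dictionary; see his answer if you aren't able to make that assumption. Alternatively, as timbo points out in the comments, if you want a key that is missing from bigdict to map to None, you can do: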
{k: bigdict.get(k, None) for k in ('l', 'm', 'n')}
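If you're using Python 3 and you only want the new dict to contain keys that actually exist in the original one, you can use the fact that dictionary view objects implement some set operations: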
{k: bigdict[k] for k in bigdict.keys() & {'l', 'm', 'n'}}
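The three variants above differ only in how they treat a wanted key that is absent from bigdict. A minimal sketch of that difference, assuming a small made-up bigdict in which 'n' is deliberately missing (the variable names strict, with_none and present_only are illustrative, not from the answer):

```python
bigdict = {'a': 1, 'l': 12, 'm': 13}   # hypothetical data; 'n' is missing on purpose
wanted = ('l', 'm', 'n')

# Strict: raises KeyError as soon as a wanted key is missing.
try:
    strict = {k: bigdict[k] for k in wanted}
except KeyError:
    strict = None

# Lenient: missing keys are kept and mapped to None.
with_none = {k: bigdict.get(k) for k in wanted}

# Filtering: missing keys are dropped (keys view & set, Python 3).
present_only = {k: bigdict[k] for k in bigdict.keys() & set(wanted)}

print(strict, with_none, present_only)
```

Which one is appropriate depends on whether a missing key should count as an error, a placeholder, or simply be ignored.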
Answer 1 (score: 99)
A bit shorter, at least:
wanted_keys = ['l', 'm', 'n'] # The keys you want
dict((k, bigdict[k]) for k in wanted_keys if k in bigdict)
Answer 2 (score: 20)
interesting_keys = ('l', 'm', 'n')
subdict = {x: bigdict[x] for x in interesting_keys if x in bigdict}
Answer 3 (score: 12)
Speed comparison of all the methods mentioned:
Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Jan 29 2016, 14:26:21) [MSC v.1500 64 bit (AMD64)] on win32
In[2]: import numpy.random as nprnd
keys = nprnd.randint(1000, size=10000)
bigdict = dict([(_, nprnd.rand()) for _ in range(1000)])
%timeit {key:bigdict[key] for key in keys}
%timeit dict((key, bigdict[key]) for key in keys)
%timeit dict(map(lambda k: (k, bigdict[k]), keys))
%timeit dict(filter(lambda i:i[0] in keys, bigdict.items()))
%timeit {key:value for key, value in bigdict.items() if key in keys}
100 loops, best of 3: 3.09 ms per loop
100 loops, best of 3: 3.72 ms per loop
100 loops, best of 3: 6.63 ms per loop
10 loops, best of 3: 20.3 ms per loop
100 loops, best of 3: 20.6 ms per loop
As expected, the dictionary comprehension over the wanted keys is the fastest option in this test.
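Note that the two slowest variants in the timings test `key in keys` against a 10000-element NumPy array, which is a linear scan for every item. A rough sketch for re-running a comparison with only the standard library, assuming a set of wanted keys is acceptable for the membership test (by_keys, by_items and key_set are illustrative names, and timings will vary by machine):

```python
import random
import timeit

random.seed(0)  # fixed seed so runs are repeatable
bigdict = {i: random.random() for i in range(1000)}
keys = [random.randrange(1000) for _ in range(10000)]
key_set = set(keys)  # assumption: a set gives constant-time `in` tests

def by_keys():
    # Loop over the wanted keys and look each one up.
    return {k: bigdict[k] for k in keys}

def by_items():
    # Loop over the whole dict and keep the wanted items.
    return {k: v for k, v in bigdict.items() if k in key_set}

for fn in (by_keys, by_items):
    seconds = timeit.timeit(fn, number=100)
    print(f"{fn.__name__}: {seconds / 100 * 1e3:.3f} ms per call")
```

Both functions build the same dictionary, so any difference in the printed numbers comes from the iteration strategy alone.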
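Answer 4 (score: 11)
This answer uses a dictionary comprehension similar to the selected answer, but it will not raise an exception on a missing item.
Python 2 version: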
{k:v for k, v in bigDict.iteritems() if k in ('l', 'm', 'n')}
Python 3 version:
{k:v for k, v in bigDict.items() if k in ('l', 'm', 'n')}
Answer 5 (score: 4)
Maybe:
subdict=dict([(x,bigdict[x]) for x in ['l', 'm', 'n']])
Python 3 even supports the following:
subdict={a:bigdict[a] for a in ['l','m','n']}
Note that you can test for existence in the dictionary as follows:
subdict=dict([(x,bigdict[x]) for x in ['l', 'm', 'n'] if x in bigdict])
or, respectively, for Python 3:
subdict={a:bigdict[a] for a in ['l','m','n'] if a in bigdict}
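Answer 6 (score: 3)
You can also use map (which is a very useful function to get to know anyway):
sd = dict(map(lambda k: (k, d.get(k, None)), keys))  # d: source dict, keys: the wanted keys
Example: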
large_dictionary = {'a1':123, 'a2':45, 'a3':344}
list_of_keys = ['a1', 'a3']
small_dictionary = dict(map(lambda key: (key, large_dictionary.get(key, None)), list_of_keys))
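P.S. I borrowed the .get(key, None) from a previous answer :)
Answer 7 (score: 3)
Okay, this is something that has bothered me a few times, so thank you Jayesh for asking it.
The answers above seem like as good a solution as any, but if you are using this all over your code, it makes sense to wrap the functionality, IMHO. Also, there are two possible use cases here: one where you care whether all the keywords are in the original dictionary, and one where you don't. It would be nice to treat both equally.
So, for my two cents, I suggest writing a subclass of dict, e.g.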
class my_dict(dict):
    def subdict(self, keywords, fragile=False):
        d = {}
        for k in keywords:
            try:
                d[k] = self[k]
            except KeyError:
                if fragile:
                    raise
        return d
Now you can pull out a sub-dictionary with
orig_dict.subdict(keywords)
Usage examples:
#
## our keywords are letters of the alphabet
keywords = 'abcdefghijklmnopqrstuvwxyz'
#
## our dictionary maps letters to their index
d = my_dict([(k,i) for i,k in enumerate(keywords)])
print('Original dictionary:\n%r\n\n' % (d,))
#
## constructing a sub-dictionary with good keywords
oddkeywords = keywords[::2]
subd = d.subdict(oddkeywords)
print('Dictionary from odd numbered keys:\n%r\n\n' % (subd,))
#
## constructing a sub-dictionary with mixture of good and bad keywords
## (fragile=True is needed here: with the default, missing keys are skipped)
somebadkeywords = keywords[1::2] + 'A'
try:
    subd2 = d.subdict(somebadkeywords, fragile=True)
    print("We shouldn't see this message")
except KeyError:
    print("subd2 construction fails:")
    print("\toriginal dictionary doesn't contain some keys\n\n")
#
## Trying again with fragile set to false
try:
    subd3 = d.subdict(somebadkeywords, fragile=False)
    print('Dictionary constructed using some bad keys:\n%r\n\n' % (subd3,))
except KeyError:
    print("We shouldn't see this message")
If you run all the above code, you should see (something like) the following output (sorry for the formatting):
Original dictionary:
{'a': 0, 'c': 2, 'b': 1, 'e': 4, 'd': 3, 'g': 6, 'f': 5, 'i': 8, 'h': 7, 'k': 10, 'j': 9, 'm': 12, 'l': 11, 'o': 14, 'n': 13, 'q': 16, 'p': 15, 's': 18, 'r': 17, 'u': 20, 't': 19, 'w': 22, 'v': 21, 'y': 24, 'x': 23, 'z': 25}

Dictionary from odd numbered keys:
{'a': 0, 'c': 2, 'e': 4, 'g': 6, 'i': 8, 'k': 10, 'm': 12, 'o': 14, 'q': 16, 's': 18, 'u': 20, 'w': 22, 'y': 24}

subd2 construction fails:
	original dictionary doesn't contain some keys

Dictionary constructed using some bad keys:
{'b': 1, 'd': 3, 'f': 5, 'h': 7, 'j': 9, 'l': 11, 'n': 13, 'p': 15, 'r': 17, 't': 19, 'v': 21, 'x': 23, 'z': 25}
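When subclassing dict is inconvenient (for example, when the dictionary is created by other code), the same strict/lenient switch can be written as a plain function. This is only a sketch of that alternative, not part of the answer above; the function name subdict and the sample data are assumptions made for illustration:

```python
def subdict(source, keywords, fragile=False):
    """Return a new dict holding only `keywords` taken from `source`.

    With fragile=True a missing key raises KeyError; otherwise it is skipped.
    """
    result = {}
    for key in keywords:
        try:
            result[key] = source[key]
        except KeyError:
            if fragile:
                raise
    return result

letters = {'a': 0, 'b': 1, 'c': 2}   # hypothetical sample data
print(subdict(letters, 'acZ'))       # 'Z' is silently skipped
```

Calling subdict(letters, 'acZ', fragile=True) raises KeyError instead, because 'Z' is not a key of letters.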
Answer 8 (score: 2)
Solution
from operator import itemgetter
from typing import List, Dict, Union


def subdict(d: Union[Dict, List], columns: List[str]) -> Union[Dict, List[Dict]]:
    """Return a dict or list of dicts with subset of
    columns from the d argument.
    """
    getter = itemgetter(*columns)

    if isinstance(d, list):
        result = []
        for subset in map(getter, d):
            record = dict(zip(columns, subset))
            result.append(record)
        return result
    elif isinstance(d, dict):
        return dict(zip(columns, getter(d)))

    raise ValueError('Unsupported type for `d`')
Usage examples
# pure dict
d = dict(a=1, b=2, c=3)
print(subdict(d, ['a', 'c']))
>>> In [5]: {'a': 1, 'c': 3}
# list of dicts
d = [
    dict(a=1, b=2, c=3),
    dict(a=2, b=4, c=6),
    dict(a=4, b=8, c=12),
]
print(subdict(d, ['a', 'c']))
>>> In [5]: [{'a': 1, 'c': 3}, {'a': 2, 'c': 6}, {'a': 4, 'c': 12}]
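One detail worth checking before relying on the itemgetter approach: operator.itemgetter returns a tuple only when it is given two or more keys. With a single key it returns the bare value, so zipping it against a one-element column list does not behave the same way. The sketch below illustrates this; subdict_safe is a hypothetical helper for the single-dict case, not part of the answer above:

```python
from operator import itemgetter

d = dict(a=1, b=2, c=3)

# Two or more keys: itemgetter returns a tuple of values.
pair = itemgetter('a', 'c')(d)

# Exactly one key: itemgetter returns the bare value, not a 1-tuple.
single = itemgetter('a')(d)

def subdict_safe(d, columns):
    """Hypothetical variant that handles any number of columns, including one."""
    return {key: d[key] for key in columns}

print(pair, single, subdict_safe(d, ['a']))
```

A plain comprehension avoids the special case at the cost of not reusing one getter across a list of records.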
Answer 9 (score: 1)
Yet another one (I prefer Mark Longair's answer):
di = {'a':1,'b':2,'c':3}
req = ['a','c','w']
dict([i for i in di.iteritems() if i[0] in di and i[0] in req])
Answer 10 (score: 0)
You can use an iterator over the wanted keys to extract the wanted pairs:
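Should be faster than anything provided before. ;)
Answer 11 (score: 0)
Using map (halfdanrump's answer) works best for me, though I haven't timed it...
But if you go for a dictionary and you have a big_dict, loop over req rather than over big_dict.
For example: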
big_dict = {'a':1,'b':2,'c':3,................................................}
req = ['a','c','w']
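{k:big_dict.get(k,None) for k in req}
# or
{k:big_dict[k] for k in req if k in big_dict}
Note that in the converse case, where req is big but my_dict is small, you should loop through my_dict instead.
In general, we are computing an intersection, and the complexity of the problem is O(min(len(big_dict), len(req))). Python's own implementation of intersection takes the sizes of the two sets into account, so it seems optimal. Also, being written in C and part of the core library, it is probably faster than most non-optimized Python statements. Therefore, a solution I would consider is: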
big_dict = {'a':1,'b':2,'c':3,................................................}
req = ['a','c','w',...................]
{k:big_dict[k] for k in set(req).intersection(big_dict.keys())}
It moves the critical operation into Python's C code, and it works in all cases.
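A small runnable version of the intersection approach, assuming a tiny stand-in for big_dict (the data and the names common and subdict are made up for illustration):

```python
big_dict = {'a': 1, 'b': 2, 'c': 3}   # hypothetical small stand-in
req = ['a', 'c', 'w']                 # 'w' is not a key of big_dict

# Keys that are both requested and present.
common = set(req).intersection(big_dict.keys())

# Build the sub-dictionary from the shared keys only.
subdict = {k: big_dict[k] for k in common}
print(subdict)
```

Because sets are unordered, the key order of the result is not guaranteed to follow req; if order matters, looping over req with an `if k in big_dict` filter keeps it.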