Question

我有一个（可能很大的）字典和'可能'键列表。我想快速找到哪些键在字典中具有匹配值。我发现了很多关于单字典值here和here的讨论，但没有讨论速度或多个条目。

我提出了四种方法，对于最有效的三种方法，我将它们的速度与下面不同的样本大小进行比较 - 有更好的方法吗？如果人们可以提出合理的竞争者，我也会对他们进行以下分析。

示例列表和词典的创建如下：

import cProfile
from random import randint

length = 100000

listOfRandomInts = [randint(0,length*length/10-1) for x in range(length)]
dictionaryOfRandomInts = {randint(0,length*length/10-1): "It's here" for x in range(length)}

方法1：'in'关键字：

def way1(theList,theDict):
    resultsList = []
    for listItem in theList:
        if listItem in theDict:
            resultsList.append(theDict[listItem])
    return resultsList

cProfile.run('way1(listOfRandomInts,dictionaryOfRandomInts)')

在0.018秒内进行32次函数调用

方法2：错误处理：

def way2(theList,theDict):
    resultsList = []
    for listItem in theList:
        try:
            resultsList.append(theDict[listItem])
        except:
            ;
    return resultsList

cProfile.run('way2(listOfRandomInts,dictionaryOfRandomInts)')

在0.087秒内进行32次函数调用

方法3：设置交集：

def way3(theList,theDict):
    return list(set(theList).intersection(set(theDict.keys())))

cProfile.run('way3(listOfRandomInts,dictionaryOfRandomInts)')

在0.046秒内完成26次函数调用

方法4：天真地使用dict.keys()：

这是一个警示故事 - 这是我的第一次尝试， BY FAR 最慢！

def way4(theList,theDict):
    resultsList = []
    keys = theDict.keys()
    for listItem in theList:
        if listItem in keys:
            resultsList.append(theDict[listItem])
    return resultsList

cProfile.run('way4(listOfRandomInts,dictionaryOfRandomInts)')

12个函数调用248.552秒

编辑：将答案中给出的建议带入我用于一致性的相同框架中。许多人已经注意到Python 3.x可以实现更多的性能提升，特别是基于列表理解的方法。非常感谢所有的帮助！

方法5：更好的交叉方式（感谢jonrsharpe）：

def way5(theList, theDict):
    return = list(set(theList).intersection(theDict))

25个函数调用0.037秒

方法6：列表理解（感谢jonrsharpe）：

def way6(theList, theDict):
    return [item for item in theList if item in theDict]

在0.020秒内完成24次函数调用

方法7：使用&关键字（感谢jonrsharpe）：

def way7(theList, theDict):
    return list(theDict.viewkeys() & theList)

25个函数调用0.026秒

对于方法1-3和5-7，我使用长度为1000,10000,100000,1000000,10000000和100000000的列表/字典对它们进行如上定时，并显示所用时间的对数 - 对数图。在所有长度上，交集和语句中的方法表现更好。梯度大约都是1（可能更高一些），表示O（n）或者可能略微超线性缩放。

Log-Log plot comparing time-scaling of the 6 sensible methods with list/dict length

Answer 1

在我尝试的其他几种方法中，最快的是一个简单的列表理解：

def way6(theList, theDict):
    return [item for item in theList if item in theDict]

这与您最快的方法way1运行相同的过程，但速度更快。相比之下，最快set的方式是

def way5(theList, theDict):
    return list(set(theList).intersection(theDict))

timeit结果：

>>> import timeit
>>> setup = """from __main__ import way1, way5, way6
from random import randint
length = 100000
listOfRandomInts = [randint(0,length*length/10-1) for x in range(length)]
dictionaryOfRandomInts = {randint(0,length*length/10-1): "It's here" for x in range(length)}
"""
>>> timeit.timeit('way1(listOfRandomInts,dictionaryOfRandomInts)', setup=setup, number=1000)
14.550477756582723
>>> timeit.timeit('way5(listOfRandomInts,dictionaryOfRandomInts)', setup=setup, number=1000)
19.597916393388232
>>> timeit.timeit('way6(listOfRandomInts,dictionaryOfRandomInts)', setup=setup, number=1000)
13.652289059326904

添加了@ abarnert的建议：

def way7(theList, theDict):
    return list(theDict.viewkeys() & theList)

并重新运行我现在获得的时间：

>>> timeit.timeit('way1(listOfRandomInts,dictionaryOfRandomInts)', setup=setup, number=1000)
13.110055883138497
>>> timeit.timeit('way5(listOfRandomInts,dictionaryOfRandomInts)', setup=setup, number=1000)
17.292466681101036
>>> timeit.timeit('way6(listOfRandomInts,dictionaryOfRandomInts)', setup=setup, number=1000)
14.351759544463917
>>> timeit.timeit('way7(listOfRandomInts,dictionaryOfRandomInts)', setup=setup, number=1000)
17.206370930653392

way1和way6已切换位置，所以我再次重新开始：

>>> timeit.timeit('way1(listOfRandomInts,dictionaryOfRandomInts)', setup=setup, number=1000)
13.648176054011941
>>> timeit.timeit('way6(listOfRandomInts,dictionaryOfRandomInts)', setup=setup, number=1000)
13.847062579316628

所以看起来set方法比列表慢，但列表和列表理解之间的区别（令人惊讶的是，至少对我来说）有点变化。我只想选一个，不要担心它，除非它后来成为一个真正的瓶颈。

Answer 2

首先，我认为你是2.7，所以我会用2.7做大部分。但值得注意的是，如果您真的对优化代码感兴趣，那么3.x分支将继续提高性能，而2.x分支永远不会。为什么你使用CPython而不是PyPy？

无论如何，还需要进行一些微观优化（除了jonrsharpe's answer中的那些：

在局部变量中缓存属性和/或全局查找（由于某种原因，它被称为LOAD_FAST）。例如：

def way1a(theList, theDict):
    resultsList = []
    rlappend = resultsList.append
    for listItem in theList:
        if listItem in theDict:
            rlappend(theDict[listItem])
    return resultsList

In [10]: %timeit way1(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 13.2 ms per loop
In [11]: %timeit way1a(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 12.4 ms per loop

但是对于某些运算符特殊方法，例如__contains__和__getitem__，可能不值得这样做。当然，在你尝试之前你不会知道：

def way1b(theList, theDict):
    resultsList = []
    rlappend = resultsList.append
    tdin = theDict.__contains__
    tdgi = theDict.__getitem__
    for listItem in theList:
        if tdin(listItem):
            rlappend(tdgi(listItem))
    return resultsList

In [14]: %timeit way1b(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 12.8 ms per loop

与此同时，Jon的way6答案已经完全通过使用listcomp来优化resultList.append，我们只是看到优化他所做的查找可能无济于事。特别是在3.x中，理解将被编译成它自己的函数，但即使在2.7中我也不会期望任何好处，原因与显式循环中的原因相同。但是，让我们试着确定一下：

def way6(theList, theDict):
    return [theDict[item] for item in theList if item in theDict]
def way6a(theList, theDict):
    tdin = theDict.__contains__
    tdgi = theDict.__getitem__
    return [tdgi(item) for item in theList if tdin(item)]

In [31]: %timeit way6(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 14.7 ms per loop
In [32]: %timeit way6a(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 13.9 ms per loop

令人惊讶的是（至少对我而言），这次它实际上有所帮助。不知道为什么。

但我真正想要的是：将过滤器表达式和值表达式转换为函数调用的另一个好处是我们可以使用filter和map：

def way6b(theList, theDict):
    tdin = theDict.__contains__
    tdgi = theDict.__getitem__
    return map(tdgi, filter(tdin, theList))
def way6c(theList, theDict):
    tdin = theDict.__contains__
    tdgi = theDict.__getitem__
    return map(tdgi, ifilter(tdin, theList))

In [34]: %timeit way6b(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 10.7 ms per loop
In [35]: %timeit way6c(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 13 ms per loop

但这种收益主要是2.x特定的; 3.x具有更快的理解力，而其list(map(filter(…)))慢于2.x的map(filter(…))或map(ifilter(…))。

您不需要将集合交叉点的两侧转换为集合，只需将左侧转换为集合。右侧可以是任何可迭代的，并且dict已经是其密钥的可迭代。

但是，更好的是，dict的关键视图（3.x中的dict.keys，2.7中的dict.keyview）已经是一个类似于set的对象，并且由dict的哈希表支持，所以你不需要转换任何。（它没有完全相同的接口 - 它没有intersection方法，但它的&运算符采用迭代，不像set，它有intersection方法，可以采用迭代但它的&只接受套装。这很烦人，但我们只关心这里的表现，对吗？）

def way3(theList,theDict):
    return list(set(theList).intersection(set(theDict.keys())))
def way3a(theList,theDict):
    return list(set(theList).intersection(theDict))
def way3b(theList,theDict):
    return list(theDict.viewkeys() & theList)

In [20]: %timeit way3(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 23.7 ms per loop
In [20]: %timeit way3a(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 15.5 ms per loop
In [20]: %timeit way3b(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 15.7 ms per loop

最后一个没有帮助（尽管使用Python 3.4而不是2.7，它的速度提高了10％......），但第一个确实没有。

在现实生活中，你可能还想比较两个集合的大小来决定哪个集合得到满足，但是这里的信息是静态的，所以编写代码来测试它是没有意义的。

无论如何，我最快的成绩是2.7的map(filter(…))，相当不错。在3.4（我没有在这里展示），Jon的listcomp是最快的（甚至修复为返回值而不是键），并且比任何2.7方法都快。此外，3.4的最快设置操作（使用键视图作为集合，列表作为可迭代）比迭代方法更接近于2.7。

Answer 3

$ ipython2 # Apple CPython 2.7.6
[snip]
In [3]: %timeit way1(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 13.8 ms per loop

$ python27x -m ipython # custom-built 2.7.9
[snip]
In [3]: %timeit way1(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 13.7 ms per loop

$ ipython3 # python.org CPython 3.4.1
[snip]
In [3]: %timeit way1(listOfRandomInts, dictionaryOfRandomInts)
100 loops, best of 3: 12.8 ms per loop

所以，仅仅通过使用后来的Python就可以实现8％的加速。（而且listcomp和dict-key-view版本的加速比率接近20％。）并不是因为Apple的2.7是坏的或者其他什么，只是3.x在过去的5年里继续得到优化，虽然2.7没有（而且永远不会再来）。

同时：

$ ipython_pypy # PyPy 2.5.0 Python 2.7.8
[snip]
In [3]: %timeit way1(listOfRandomInts, dictionaryOfRandomInts)
1000000000 loops, best of 3: 1.97 ns per loop

只需键入5个额外字符即可获得7000000倍的加速。：）

我确定在这里作弊。要么JIT隐含地记住了结果，要么注意到我甚至没有查看结果并将其推到链上并意识到它不需要执行任何步骤或其他任何操作。但这有时会发生在现实生活中;我有一大堆代码，花了3天时间调试并尝试优化，然后才意识到它所做的一切都是不必要的......

无论如何，即使不能作弊，PyPy也会提供大约10倍的加速。而且它比调整属性查找或逆转谁变成5％的集合的顺序容易得多。

Jython更难以预测 - 有时几乎和PyPy一样快，有时比CPython慢得多。不幸的是，timeit在Jython 2.5.3中被破坏了，我刚从rc2升级到rc3就完全破坏了我的Jython 2.7，所以......今天没有测试。同样，IronPython基本上是在不同的VM上重做Jython;它通常更快，但再次无法预测。但我当前版本的Mono和我当前版本的IronPython并没有很好地协同工作，所以也没有测试。

快速查找给定列表中字典中的所有键

3 个答案: