Question

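Does anyone know which of these is better to use with regard to speed and resources? A link to some trusted sources would be much appreciated.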

if key not in dictionary.keys():

or

if not dictionary.get(key):

Answer 1

First, you would write

if key not in dictionary:

because dicts are iterated by key, so there is no need to call keys().

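Second, the two statements are not equivalent: the second condition is also true when the corresponding value is falsy (0, "", [], etc.), not only when the key is missing.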
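A minimal sketch of that difference, assuming a small made-up dictionary; the helper names missing_by_in and missing_by_get are hypothetical and only there for illustration:

```python
# Hypothetical sample data: two keys hold falsy values, one holds a truthy value.
dictionary = {"zero": 0, "empty": "", "one": 1}

def missing_by_in(key):
    # True only when the key is really absent.
    return key not in dictionary

def missing_by_get(key):
    # True when the key is absent OR its value is falsy.
    return not dictionary.get(key)

for key in ("zero", "empty", "one", "absent"):
    print(key, missing_by_in(key), missing_by_get(key))
```

The two checks agree for "one" and "absent", but disagree for "zero" and "empty", where the key exists and the get-based check still reports it as missing.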

Finally, the first approach is definitely faster and more pythonic. Function/method calls are expensive. If you are unsure, timeit.

Answer 2

In my experience, using in is faster than using get, although the speed of get can be improved by caching the get method so that it does not have to be looked up each time. Here are some timeit tests:

''' in vs get speed test

    Comparing the speed of cache retrieval / update using
    `get` vs using `in`

    http://stackoverflow.com/a/35451912/4014959
    Written by PM 2Ring 2015.12.01
    Updated for Python 3 2017.08.08
'''

from __future__ import print_function

from timeit import Timer
from random import randint
import dis

cache = {}

def get_cache(x):
    ''' retrieve / update cache using `get` '''
    res = cache.get(x)
    if res is None:
        res = cache[x] = x
    return res

def get_cache_defarg(x, get=cache.get):
    ''' retrieve / update cache using defarg `get` '''
    res = get(x)
    if res is None:
        res = cache[x] = x
    return res

def in_cache(x):
    ''' retrieve / update cache using `in` '''
    if x in cache:
        return cache[x]
    else:
        res = cache[x] = x
        return res

#slow to fast.
funcs = (
    get_cache,
    get_cache_defarg,
    in_cache,
)

def show_bytecode():
    for func in funcs:
        fname = func.__name__
        print('\n%s' % fname)
        dis.dis(func)

def time_test(reps, loops):
    ''' Print timing stats for all the functions '''
    for func in funcs:
        fname = func.__name__
        print('\n%s: %s' % (fname, func.__doc__))
        setup = 'from __main__ import data, ' + fname
        cmd = 'for v in data: %s(v)' % (fname,)
        times = []
        t = Timer(cmd, setup)
        for i in range(reps):
            r = 0
            for j in range(loops):
                r += t.timeit(1)
                cache.clear()
            times.append(r)
        times.sort()
        print(times)

datasize = 1024
maxdata = 32
data = [randint(1, maxdata) for i in range(datasize)]

#show_bytecode()
time_test(3, 500)

get_cache:  retrieve / update cache using `get` 
[0.65624237060546875, 0.68499755859375, 0.76354193687438965]

get_cache_defarg:  retrieve / update cache using defarg `get` 
[0.54204297065734863, 0.55032730102539062, 0.56702113151550293]

in_cache:  retrieve / update cache using `in` 
[0.48754477500915527, 0.49125504493713379, 0.50087881088256836]

The output above is typical for my 2 GHz machine running Python 2.6.6.


Answer 3

OK, I have tested it on Python 3.4.3, and all three approaches give the same result, around 0.00001 seconds.

import random
a = {}
for i in range(0, 1000000):
    a[str(random.random())] = random.random()
import time
t1 = time.time(); 1 in a.keys(); t2 = time.time(); print("Time=%s" % (t2 - t1))
t1 = time.time(); 1 in a; t2 = time.time(); print("Time=%s" % (t2 - t1))
t1 = time.time(); not a.get(1); t2 = time.time(); print("Time=%s" % (t2 - t1))
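A single time.time() measurement of one lookup is mostly timer resolution and noise, which may be why the three numbers look identical. Here is a rough sketch of the same comparison using timeit instead. The smaller dictionary size, the measure helper, and the repeat counts are my own assumptions, not part of the test above:

```python
import random
import timeit

# Assumption: 100,000 entries rather than 1,000,000, to keep the run short.
a = {str(random.random()): random.random() for _ in range(100000)}

# The same three expressions as in the test above.
statements = ("1 in a.keys()", "1 in a", "not a.get(1)")

def measure(number=100000, repeat=5):
    """Return the best per-call time in seconds for each statement."""
    results = {}
    for stmt in statements:
        # Run each statement `number` times per trial, for `repeat` trials.
        totals = timeit.repeat(stmt, globals={"a": a}, number=number, repeat=repeat)
        # The minimum is the trial least disturbed by other system activity.
        results[stmt] = min(totals) / number
    return results

if __name__ == "__main__":
    for stmt, seconds in measure().items():
        print("%-16s %.1f ns per call" % (stmt, seconds * 1e9))
```

The globals argument of timeit.repeat needs Python 3.5 or later. Averaging over many calls should make any difference between the three forms visible, though the exact numbers will depend on the machine and interpreter.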

Answer 4

TL;DR: Use if key not in dictionary. It is idiomatic, robust and fast.

There are four versions relevant to this question: the two from the question, and their optimal variants:

key not in dictionary.keys()  # inA
key not in dictionary         # inB
not dictionary.get(key)       # getA
sentinel = object()
dictionary.get(key, sentinel) is sentinel  # getB

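Both A variants have shortcomings that mean you should not use them. inA needlessly creates a dict view on the keys, which adds an indirection step. getA looks at the truthiness of the value, which gives incorrect results for values such as '' or 0.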

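As for using getB over inB: both do the same thing, namely checking whether there is a value for key. However, getB also returns that value or the default, and then has to compare it against the sentinel. As a result, using get is considerably slower: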

$ PREPARE="
> import random
> data = {a: True for a in range(0, 512, 2)}
> sentinel=object()"
$ python3 -m perf timeit -s "$PREPARE" '27 in data'
.....................
Mean +- std dev: 33.9 ns +- 0.8 ns
$ python3 -m perf timeit -s "$PREPARE" 'data.get(27, sentinel) is not sentinel'
.....................
Mean +- std dev: 105 ns +- 5 ns

Note that on pypy3 the two variants perform virtually the same once the JIT has warmed up.
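For completeness, here is a small sketch of why the get-based check needs a dedicated sentinel rather than the default None. The names _MISSING and has_key_via_get and the sample data are hypothetical, chosen only for this illustration:

```python
# A unique object that cannot be stored in the dict by accident.
_MISSING = object()

def has_key_via_get(dictionary, key):
    # get returns _MISSING only when the key is absent.
    return dictionary.get(key, _MISSING) is not _MISSING

# Hypothetical sample data with a None value and a falsy value.
data = {"a": None, "b": 0}

print(has_key_via_get(data, "a"))   # key present, value is None
print(has_key_via_get(data, "b"))   # key present, value is falsy
print(has_key_via_get(data, "c"))   # key absent
print(data.get("a") is not None)    # naive None check misreports "a" as absent
```

The sentinel version agrees with key in data in every case, whereas comparing against None breaks as soon as None is a legitimate value. Since key in data gives the same answer with less work, the sentinel form is mainly useful when you also need the value.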

Python not in dict condition performance

4 answers: