一时兴起,我最近用timeit
测试了这两种方法,看看哪种评估方法更快:
import timeit
"""Test method returns True if either argument is falsey, else False."""
def and_chk((a, b)):
if not (a and b):
return True
return False
def not_or_chk((a, b)):
if not a or not b:
return True
return False
......并得到了这些结果:
VALUES FOR a,b -> 0,0 0,1 1,0 1,1
method
and_chk(a,b) 0.95559 0.98646 0.95138 0.98788
not_or_chk(a,b) 0.96804 1.07323 0.96015 1.05874
...seconds per 1,111,111 cycles.
效率的差异在1%到9%之间,总是支持if not (a and b)
,这与我的预期相反,因为我理解if not a or not b
将评估其术语({{1然后if not a
)按顺序运行if not b
块,一旦遇到真正的表达式(并且没有if
子句)。相比之下,and
方法需要先评估两个子句,然后才能将任何结果返回给包装它的and_chk
。
if not..
条件?我完全清楚这种程度的微观优化实际上,如果不是完全的话,毫无意义。我只想了解Python的用途。
为了完成起见,这就是我设置if
...
timeit
...然后使用cyc = 1111111
bothFalse_and = iter([(0,0)] * cyc)
zeroTrue_and = iter([(1,0)] * cyc)
oneTrue_and = iter([(0,1)] * cyc)
bothTrue_and = iter([(1,1)] * cyc)
bothFalse_notor = iter([(0,0)] * cyc)
zeroTrue_notor = iter([(1,0)] * cyc)
oneTrue_notor = iter([(0,1)] * cyc)
bothTrue_notor = iter([(1,1)] * cyc)
time_bothFalse_and = timeit.Timer('and_chk(next(tups))', 'from __main__ import bothFalse_and as tups, and_chk')
time_zeroTrue_and = timeit.Timer('and_chk(next(tups))', 'from __main__ import zeroTrue_and as tups, and_chk')
time_oneTrue_and = timeit.Timer('and_chk(next(tups))', 'from __main__ import oneTrue_and as tups, and_chk')
time_bothTrue_and = timeit.Timer('and_chk(next(tups))', 'from __main__ import bothTrue_and as tups, and_chk')
time_bothFalse_notor = timeit.Timer('not_or_chk(next(tups))', 'from __main__ import bothFalse_notor as tups, not_or_chk')
time_zeroTrue_notor = timeit.Timer('not_or_chk(next(tups))', 'from __main__ import zeroTrue_notor as tups, not_or_chk')
time_oneTrue_notor = timeit.Timer('not_or_chk(next(tups))', 'from __main__ import oneTrue_notor as tups, not_or_chk')
time_bothTrue_notor = timeit.Timer('not_or_chk(next(tups))', 'from __main__ import bothTrue_notor as tups, not_or_chk')
运行每个timeit.Timer(..)
函数以获取结果。
答案 0 :(得分:27)
not_or_chk
函数需要在添加中对两次跳转进行两次一致操作(在最坏的情况下),而and_chk
函数只有两次跳转(同样,在最坏的情况)。
救援的dis module! dis
模块允许您查看代码的Python字节码反汇编。例如:
import dis
"""Test method returns True if either argument is falsey, else False."""
def and_chk((a, b)):
if not (a and b):
return True
return False
def not_or_chk((a, b)):
if not a or not b:
return True
return False
print("And Check:\n")
print(dis.dis(and_chk))
print("Or Check:\n")
print(dis.dis(not_or_chk))
生成此输出:
And Check:
5 0 LOAD_FAST 0 (.0)
3 UNPACK_SEQUENCE 2
6 STORE_FAST 1 (a)
9 STORE_FAST 2 (b)
6 12 LOAD_FAST 1 (a) * This block is the *
15 JUMP_IF_FALSE_OR_POP 21 * disassembly of *
18 LOAD_FAST 2 (b) * the "and_chk" *
>> 21 POP_JUMP_IF_TRUE 28 * function *
7 24 LOAD_GLOBAL 0 (True)
27 RETURN_VALUE
8 >> 28 LOAD_GLOBAL 1 (False)
31 RETURN_VALUE
None
Or Check:
10 0 LOAD_FAST 0 (.0)
3 UNPACK_SEQUENCE 2
6 STORE_FAST 1 (a)
9 STORE_FAST 2 (b)
11 12 LOAD_FAST 1 (a) * This block is the *
15 UNARY_NOT * disassembly of *
16 POP_JUMP_IF_TRUE 26 * the "not_or_chk" *
19 LOAD_FAST 2 (b) * function *
22 UNARY_NOT
23 POP_JUMP_IF_FALSE 30
12 >> 26 LOAD_GLOBAL 0 (True)
29 RETURN_VALUE
13 >> 30 LOAD_GLOBAL 1 (False)
33 RETURN_VALUE
None
看看我用星号标记的两个Python字节码块。这些块是你的两个反汇编函数。请注意,and_chk
只有两次跳转,函数中的计算是,同时决定是否跳转。
另一方面,not_or_chk
函数要求not
操作在最坏的情况下执行两次,另外到解释器决定是否要跳起来。
答案 1 :(得分:3)
我刚刚通过您的Meta SO问题注意到了这个问题:Is it appropriate to share the results of my research toward solving my own minor questions?。正如你在那个问题中提到的(这个问题上的一个标签表明),这类事情属于微优化的范畴。理想情况下,人们不应该担心这种微小的性能差异,正如Knuth所说,premature optimization is the root of all evil。但是我觉得调查这些问题很有趣也很有启发性,因为它可以让你更好地了解Python如何在“引擎盖下”工作。
Mephy's comment促使我看看if
的时间差异是什么 - 更少版本的功能。结果很有意思,恕我直言。我也借此机会简化了您的测试程序。
#!/usr/bin/env python
''' Do timeit tests on various implementations of NAND
NAND returns True if either argument is falsey, else False.
From https://stackoverflow.com/q/29551438/4014959
Written by PM 2Ring 2015.04.09
'''
from timeit import Timer
import dis
def and_chk(a, b):
return not (a and b)
def and_chk_if(a, b):
if not (a and b):
return True
else:
return False
def not_or_chk(a, b):
return not a or not b
def not_or_chk_if(a, b):
if not a or not b:
return True
else:
return False
#All the functions
funcs = (
and_chk,
and_chk_if,
not_or_chk,
not_or_chk_if,
)
#Argument tuples to test the functions with
bools = (0, 1)
bool_tups = [(u, v) for u in bools for v in bools]
def show_dis():
''' Show the disassembly for each function '''
print 'Disassembly'
for func in funcs:
fname = func.func_name
print '\n%s' % fname
dis.dis(func)
print
def verify():
''' Verify that the functions actually perform as intended '''
print 'Verifying...'
for func in funcs:
fname = func.func_name
print '\n%s' % fname
for args in bool_tups:
print args, func(*args)
print
def time_test(loops, reps):
''' Print timing stats for all the functions '''
print 'Timing tests\nLoops = %d, Repetitions = %d' % (loops, reps)
for func in funcs:
fname = func.func_name
print '\n%s' % fname
setup = 'from __main__ import %s' % fname
for args in bool_tups:
t = Timer('%s%s' % (fname, args), setup)
r = t.repeat(reps, loops)
r.sort()
print args, r
show_dis()
verify()
time_test(loops=520000, reps=3)
<强>输出强>
Disassembly
and_chk
13 0 LOAD_FAST 0 (a)
3 JUMP_IF_FALSE 4 (to 10)
6 POP_TOP
7 LOAD_FAST 1 (b)
>> 10 UNARY_NOT
11 RETURN_VALUE
and_chk_if
16 0 LOAD_FAST 0 (a)
3 JUMP_IF_FALSE 4 (to 10)
6 POP_TOP
7 LOAD_FAST 1 (b)
>> 10 JUMP_IF_TRUE 5 (to 18)
13 POP_TOP
17 14 LOAD_GLOBAL 0 (True)
17 RETURN_VALUE
>> 18 POP_TOP
19 19 LOAD_GLOBAL 1 (False)
22 RETURN_VALUE
23 LOAD_CONST 0 (None)
26 RETURN_VALUE
not_or_chk
22 0 LOAD_FAST 0 (a)
3 UNARY_NOT
4 JUMP_IF_TRUE 5 (to 12)
7 POP_TOP
8 LOAD_FAST 1 (b)
11 UNARY_NOT
>> 12 RETURN_VALUE
not_or_chk_if
25 0 LOAD_FAST 0 (a)
3 UNARY_NOT
4 JUMP_IF_TRUE 8 (to 15)
7 POP_TOP
8 LOAD_FAST 1 (b)
11 UNARY_NOT
12 JUMP_IF_FALSE 5 (to 20)
>> 15 POP_TOP
26 16 LOAD_GLOBAL 0 (True)
19 RETURN_VALUE
>> 20 POP_TOP
28 21 LOAD_GLOBAL 1 (False)
24 RETURN_VALUE
25 LOAD_CONST 0 (None)
28 RETURN_VALUE
Verifying...
and_chk
(0, 0) True
(0, 1) True
(1, 0) True
(1, 1) False
and_chk_if
(0, 0) True
(0, 1) True
(1, 0) True
(1, 1) False
not_or_chk
(0, 0) True
(0, 1) True
(1, 0) True
(1, 1) False
not_or_chk_if
(0, 0) True
(0, 1) True
(1, 0) True
(1, 1) False
Timing tests
Loops = 520000, Repetitions = 3
and_chk
(0, 0) [0.36773014068603516, 0.37793493270874023, 0.38387489318847656]
(0, 1) [0.36292791366577148, 0.37119913101196289, 0.37146902084350586]
(1, 0) [0.38673520088195801, 0.39340090751647949, 0.39670205116271973]
(1, 1) [0.38938498497009277, 0.39505791664123535, 0.40034103393554688]
and_chk_if
(0, 0) [0.4021449089050293, 0.40345501899719238, 0.41558098793029785]
(0, 1) [0.40686416625976562, 0.41213202476501465, 0.44800615310668945]
(1, 0) [0.4281308650970459, 0.42916202545166016, 0.43681907653808594]
(1, 1) [0.46246123313903809, 0.46759700775146484, 0.47544980049133301]
not_or_chk
(0, 0) [0.35435700416564941, 0.36368083953857422, 0.36867713928222656]
(0, 1) [0.35602092742919922, 0.35732793807983398, 0.36237406730651855]
(1, 0) [0.39550518989562988, 0.40660715103149414, 0.40977287292480469]
(1, 1) [0.4060060977935791, 0.4112389087677002, 0.41296815872192383]
not_or_chk_if
(0, 0) [0.4308779239654541, 0.44109201431274414, 0.45827698707580566]
(0, 1) [0.43600606918334961, 0.4370419979095459, 0.47623395919799805]
(1, 0) [0.48452401161193848, 0.48769593238830566, 0.49147915840148926]
(1, 1) [0.53107500076293945, 0.54048299789428711, 0.55434417724609375]
这些时序是在运行Mepis 11(Debian系列Linux发行版)的2GHz Pentium 4(单核32位)上使用Python 2.6.6执行的。
你会注意到我已经避免使用你的next(tups)
策略来获取每个函数调用的参数,而是直接传递args作为常量。执行next(tups)
所需的时间应该相当稳定,但最好在实际时消除此类开销,以便报告的时间测量更准确地反映我们的代码的性能真的对此感兴趣。
此外,通常会执行几次重复的定时循环并取最小值; FWIW,我一般做3至5次。来自timeit docs:
注意强>
计算结果的平均值和标准偏差很诱人 矢量并报告这些。但是,这不是很有用。在一个 典型情况下,最低值给出了你的速度的下限 机器可以运行给定的代码片段;结果中的值越高 向量通常不是由Python的速度变化引起的,而是 通过其他过程干扰您的计时准确性。所以min() 结果可能是你应该感兴趣的唯一数字。 在那之后,你应该看看整个矢量并应用常见的 感觉而不是统计。
你的Meta帖子说你要表演&amp;报告其他微优化实验,所以你可能有兴趣看看我几个月前发布的一些代码,它们对factorial函数的各种实现进行了时间测试。