为什么这两个循环的速度完全相同?

时间:2017-04-20 00:31:52

标签: python loops optimization bitwise-operators

l = [1,0,0,1,1]

count = 0
start = time()
for _ in range(100000):
    for x in range(len(l)-1):
        for y in range(x+1, len(l)):
            if l[x] + l[y] == 1:
                count += 1
end = time()
count2 = 0
start2 = time()
for _ in range(100000):
    for x in range(len(l)-1):
        for y in range(x+1, len(l)):
            if l[x]^l[y]:
                count2 += 1
end2 = time()

print str(count) + ' : Add and compare took: ' + str((end - start)/100000)
print str(count2) + ' : Bitwise took: ' + str((end2 - start2)/100000)

根据我对按位操作的理解,它们应该比简单的比较更快。然而,这两个循环最终都是完全相同的速度(没有过分挑剔)。

当整数比较看起来同样快时,使用可以说更复杂的按位运算比整数比较有什么好处?

编辑:

看来,操作码并非都是平等的。

奥斯汀的回答提到两个操作之间的差异是3个操作码对比1,但是,以下示例具有相同数量的操作码,但性能明显不同:

i = j = 10

def test1():
    if i == j:
        print True

def test2():
    if not i-j:
        print True

print 'test 1'
start1 = time()
test1()
end1 = time()
dis(test1)
print 'test 2'
start2 = time()
test2()
end2 = time()
dis(test2)
print 'Test 1 took: ' + str(end1 - start1)
print 'Test 2 took: ' + str(end2 - start2)

这将输出:

test 1
True
 25           0 LOAD_GLOBAL              0 (i)
              3 LOAD_GLOBAL              1 (j)
              6 COMPARE_OP               2 (==)
              9 POP_JUMP_IF_FALSE       20

 26          12 LOAD_GLOBAL              2 (True)
             15 PRINT_ITEM          
             16 PRINT_NEWLINE       
             17 JUMP_FORWARD             0 (to 20)
        >>   20 LOAD_CONST               0 (None)
             23 RETURN_VALUE        
test 2
True
 29           0 LOAD_GLOBAL              0 (i)
              3 LOAD_GLOBAL              1 (j)
              6 BINARY_SUBTRACT     
              7 POP_JUMP_IF_TRUE        18

 30          10 LOAD_GLOBAL              2 (True)
             13 PRINT_ITEM          
             14 PRINT_NEWLINE       
             15 JUMP_FORWARD             0 (to 18)
        >>   18 LOAD_CONST               0 (None)
             21 RETURN_VALUE        
Test 1 took: 7.86781311035e-06
Test 2 took: 5.00679016113e-06

有更准确的衡量效率的方法吗?

编辑:

创建的操作码几乎相等。

修改代码以排除令人讨厌的I / O,这说明I / O存在问题的原因。

i = j = 10
bool1 = False
bool2 = False

def test1():
    if i == j:
        bool1 = True

def test2():
    if not i-j:
        bool2 = True

print 'test 1'
start1 = time()
for _ in range(1000000):
    test1()
end1 = time()
dis(test1)
print 'test 2'
start2 = time()
for _ in range(1000000):
    test2()
end2 = time()
dis(test2)
print str(bool1) + ' : Test 1 took: ' + str(end1 - start1)
print str(bool2) + ' : Test 2 took: ' + str(end2 - start2)

将打印:

test 1
 27           0 LOAD_GLOBAL              0 (i)
              3 LOAD_GLOBAL              1 (j)
              6 COMPARE_OP               2 (==)
              9 POP_JUMP_IF_FALSE       21

 28          12 LOAD_GLOBAL              2 (True)
             15 STORE_FAST               0 (bool1)
             18 JUMP_FORWARD             0 (to 21)
        >>   21 LOAD_CONST               0 (None)
             24 RETURN_VALUE        
test 2
 31           0 LOAD_GLOBAL              0 (i)
              3 LOAD_GLOBAL              1 (j)
              6 BINARY_SUBTRACT     
              7 POP_JUMP_IF_TRUE        19

 32          10 LOAD_GLOBAL              2 (True)
             13 STORE_FAST               0 (bool2)
             16 JUMP_FORWARD             0 (to 19)
        >>   19 LOAD_CONST               0 (None)
             22 RETURN_VALUE        
False : Test 1 took: 0.156816959381
False : Test 2 took: 0.16281914711

所以不是那么激烈,但仍然略有不同。这次运行约12次,测试1仅运行一次。

所以还有一些神秘感!只是没那么激烈。

1 个答案:

答案 0 :(得分:3)

循环中的所有代码基本相同,所以我删除了它们。相反,我将代码缩减为两个函数,并让我的好朋友dis.dis向我展示他们在做什么:

l = [1,0,0,1,1]

def f1():
    x = y = 0
    if l[x] + l[y] == 1:
        count += 1

def f2():
    x = y = 0
    if l[x]^l[y]:
        count2 += 1


import dis
print "F1"
dis.dis(f1)
print "F2"
dis.dis(f2)

这是输出:

$ python2.7 test.py
F1
  4           0 LOAD_CONST               1 (0)
              3 DUP_TOP
              4 STORE_FAST               0 (x)
              7 STORE_FAST               1 (y)

  5          10 LOAD_GLOBAL              0 (l)
             13 LOAD_FAST                0 (x)
             16 BINARY_SUBSCR
             17 LOAD_GLOBAL              0 (l)
             20 LOAD_FAST                1 (y)
             23 BINARY_SUBSCR
             24 BINARY_ADD                          ## Here.
             25 LOAD_CONST               2 (1)      ## Here.
             28 COMPARE_OP               2 (==)     ## Here.
             31 POP_JUMP_IF_FALSE       47

  6          34 LOAD_FAST                2 (count)
             37 LOAD_CONST               2 (1)
             40 INPLACE_ADD
             41 STORE_FAST               2 (count)
             44 JUMP_FORWARD             0 (to 47)
        >>   47 LOAD_CONST               0 (None)
             50 RETURN_VALUE
F2
  9           0 LOAD_CONST               1 (0)
              3 DUP_TOP
              4 STORE_FAST               0 (x)
              7 STORE_FAST               1 (y)

 10          10 LOAD_GLOBAL              0 (l)
             13 LOAD_FAST                0 (x)
             16 BINARY_SUBSCR
             17 LOAD_GLOBAL              0 (l)
             20 LOAD_FAST                1 (y)
             23 BINARY_SUBSCR
             24 BINARY_XOR                          ## Here.
             25 POP_JUMP_IF_FALSE       41

 11          28 LOAD_FAST                2 (count2)
             31 LOAD_CONST               2 (1)
             34 INPLACE_ADD
             35 STORE_FAST               2 (count2)
             38 JUMP_FORWARD             0 (to 41)
        >>   41 LOAD_CONST               0 (None)
             44 RETURN_VALUE

区别是3个操作码与1个操作码。这些操作的设置是6个操作码。差异在噪音中消失了。