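While benchmarking an application, I noticed that accessing array items by index is relatively expensive in Python, making for v in lst: v significantly faster than for i in range(len(lst)): lst[i]: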
from array import array
a_ = array('f', range(1000))

def f1():
    a = a_
    acc = 0
    for v in a:
        acc += v
    return acc

def f2():
    a = a_
    acc = 0
    for i in range(len(a)):
        acc += a[i]
    return acc

from dis import dis
from timeit import timeit
for f in f1,f2:
    dis(f)
    print(timeit(f, number=20000))
    print()
This produces:
  9           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)
 10           6 LOAD_CONST               1 (0)
              9 STORE_FAST               1 (acc)
 11          12 SETUP_LOOP              24 (to 39)
             15 LOAD_FAST                0 (a)
             18 GET_ITER
        >>   19 FOR_ITER                16 (to 38)
             22 STORE_FAST               2 (v)
 12          25 LOAD_FAST                1 (acc)
             28 LOAD_FAST                2 (v)
             31 INPLACE_ADD
             32 STORE_FAST               1 (acc)
             35 JUMP_ABSOLUTE           19
        >>   38 POP_BLOCK
 14     >>   39 LOAD_FAST                1 (acc)
             42 RETURN_VALUE
0.6036834940023255

 17           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)
 18           6 LOAD_CONST               1 (0)
              9 STORE_FAST               1 (acc)
 19          12 SETUP_LOOP              40 (to 55)
             15 LOAD_GLOBAL              1 (range)
             18 LOAD_GLOBAL              2 (len)
             21 LOAD_FAST                0 (a)
             24 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 GET_ITER
        >>   31 FOR_ITER                20 (to 54)
             34 STORE_FAST               2 (i)
 20          37 LOAD_FAST                1 (acc)
             40 LOAD_FAST                0 (a)
             43 LOAD_FAST                2 (i)
             46 BINARY_SUBSCR
             47 INPLACE_ADD
             48 STORE_FAST               1 (acc)
             51 JUMP_ABSOLUTE           31
        >>   54 POP_BLOCK
 22     >>   55 LOAD_FAST                1 (acc)
             58 RETURN_VALUE
1.0093544629999087
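With indexed access, the core of the loop differs only by the extra LOAD_FAST and BINARY_SUBSCR opcodes. Yet that is enough for the iterator-based version to take about 40% less time than the indexed one (0.60 s versus 1.01 s).
Unfortunately, in this form the iterator can only be used to read the input array. Is there a way to use a "fast" iterator to change the items of an array in place, or do I have to stick with the "slow" indexed access?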
Answer 0 (score: 2)
For a full loop, you can split the difference with enumerate, using indexed access to set each value and the loop name to read it:
for i, value in enumerate(mysequence):
    mysequence[i] = do_stuff_with(value)
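However, you cannot avoid the indexed assignment in a regular loop construct. Python has no equivalent of C++ reference semantics, where assignment changes the referenced value instead of rebinding the name.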
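To make the rebinding point concrete, here is a minimal sketch. The `values` list and the multiply-by-ten step are made up for illustration. Assigning to the loop variable leaves the sequence untouched, while assigning through the index changes it.

```python
values = [1, 2, 3]

# Assigning to the loop variable only rebinds the local name `v`;
# the list keeps holding the original objects.
for v in values:
    v = v * 10
unchanged = list(values)

# Assigning through the index stores a new object in the list itself.
for i, v in enumerate(values):
    values[i] = v * 10
changed = list(values)

print(unchanged)  # [1, 2, 3]
print(changed)    # [10, 20, 30]
```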
That said, if the work is simple, a list comprehension can avoid indexing altogether by building a new list and replacing the old contents in bulk:
mysequence[:] = [do_stuff_with(value) for value in mysequence]
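Assigning to the full slice of mysequence ensures it is modified in place, so other references to it also see the change. If you don't want that behavior, leave off the [:] (the name is then rebound to a new list that no other reference points to).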
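The difference between slice assignment and plain rebinding shows up as soon as a second reference exists. The following sketch uses hypothetical names (`data`, `alias`) and a doubling step chosen only for illustration.

```python
data = [1, 2, 3]
alias = data  # second reference to the same list object

# Slice assignment replaces the contents of the existing list object.
data[:] = [x * 2 for x in data]
same_object = alias is data
alias_after_slice = list(alias)

# Plain assignment rebinds `data` to a brand-new list;
# `alias` still refers to the old one and does not see the change.
data = [x * 2 for x in data]
rebound = alias is not data

print(same_object, alias_after_slice)  # True [2, 4, 6]
print(rebound, alias, data)            # True [2, 4, 6] [4, 8, 12]
```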
Answer 1 (score: 1)
Here are some timing results for the different approaches:
#for v in a #for i in range(len(a)) #for i,v in enumerate(a)
[[0.47590930000296794, 0.8639191000038409, 0.7616558000008808],
[0.43640120000054594, 0.832395199999155, 0.7896779000002425],
[0.44416509999427944, 0.8366088000038872, 0.7590674000020954]]
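The harness behind these numbers is not shown, so the following is only a sketch of how such a three-way comparison could be reproduced. The array size, the function names, and the `number`/`repeat` values are assumptions, and the absolute timings will differ by machine and Python version.

```python
from array import array
from timeit import repeat

a_ = array('f', range(1000))  # assumed size, matching the question

def by_value():
    acc = 0
    for v in a_:
        acc += v
    return acc

def by_index():
    acc = 0
    for i in range(len(a_)):
        acc += a_[i]
    return acc

def by_enumerate():
    acc = 0
    for i, v in enumerate(a_):
        acc += v
    return acc

funcs = (by_value, by_index, by_enumerate)
# One list of three measurements per function ...
timings = [repeat(f, number=200, repeat=3) for f in funcs]
# ... transposed so that each row holds one measurement per loop style.
rows = list(zip(*timings))
for row in rows:
    print(row)
```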
Note that numpy arrays are very fast, but only if you build the array in numpy and use nothing but native numpy functions:
import numpy as np

def f4():
    N = 1000
    vect = np.arange(float(N))
    return np.sum(vect)
timeit gives:
[0.09995190000336152
0.10408379999716999
0.09926139999879524]
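Modifying a numpy array through explicit indexing does not seem to bring any benefit. Likewise, copying any native Python structure into a numpy array is expensive.
Answer 2 (score: 0)
As a complement to @ShadowRunner's answer, I ran some extra benchmarks to compare different ways of modifying the array.
Even for relatively large arrays, it is cheaper to build a new array from a list comprehension and then overwrite the original array than to modify the array's items one at a time.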
import itertools
from functools import reduce
import operator
from array import array
a_ = array('f', range(100000))
b_ = array('f', range(100000))

def f1():
    a = a_
    b = b_
    for i,v in enumerate(a):
        b[i] = v*2

def f2():
    a = a_
    b = b_
    for i in range(len(a)):
        b[i] = a[i]*2

def f3():
    a = a_
    b = b_
    getter = a.__getitem__
    setter = b.__setitem__
    for i in range(len(a)):
        setter(i, getter(i)*2)

def f4():
    a = a_
    b = b_
    b[:] = array('f', [v*2 for v in a])

from dis import dis
from timeit import timeit
for f in f1,f2,f3,f4:
    dis(f)
    print(timeit(f, number=2000))
    print()
Results:
 10           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)
 11           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)
 12          12 SETUP_LOOP              40 (to 55)
             15 LOAD_GLOBAL              2 (enumerate)
             18 LOAD_FAST                0 (a)
             21 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             24 GET_ITER
        >>   25 FOR_ITER                26 (to 54)
             28 UNPACK_SEQUENCE          2
             31 STORE_FAST               2 (i)
             34 STORE_FAST               3 (v)
 13          37 LOAD_FAST                3 (v)
             40 LOAD_CONST               1 (2)
             43 BINARY_MULTIPLY
             44 LOAD_FAST                1 (b)
             47 LOAD_FAST                2 (i)
             50 STORE_SUBSCR
             51 JUMP_ABSOLUTE           25
        >>   54 POP_BLOCK
        >>   55 LOAD_CONST               0 (None)
             58 RETURN_VALUE
24.717656177999743

 16           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)
 17           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)
 18          12 SETUP_LOOP              44 (to 59)
             15 LOAD_GLOBAL              2 (range)
             18 LOAD_GLOBAL              3 (len)
             21 LOAD_FAST                0 (a)
             24 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 GET_ITER
        >>   31 FOR_ITER                24 (to 58)
             34 STORE_FAST               2 (i)
 19          37 LOAD_FAST                0 (a)
             40 LOAD_FAST                2 (i)
             43 BINARY_SUBSCR
             44 LOAD_CONST               1 (2)
             47 BINARY_MULTIPLY
             48 LOAD_FAST                1 (b)
             51 LOAD_FAST                2 (i)
             54 STORE_SUBSCR
             55 JUMP_ABSOLUTE           31
        >>   58 POP_BLOCK
        >>   59 LOAD_CONST               0 (None)
             62 RETURN_VALUE
25.86994492100348

 22           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)
 23           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)
 24          12 LOAD_FAST                0 (a)
             15 LOAD_ATTR                2 (__getitem__)
             18 STORE_FAST               2 (getter)
 25          21 LOAD_FAST                1 (b)
             24 LOAD_ATTR                3 (__setitem__)
             27 STORE_FAST               3 (setter)
 26          30 SETUP_LOOP              49 (to 82)
             33 LOAD_GLOBAL              4 (range)
             36 LOAD_GLOBAL              5 (len)
             39 LOAD_FAST                0 (a)
             42 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             45 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             48 GET_ITER
        >>   49 FOR_ITER                29 (to 81)
             52 STORE_FAST               4 (i)
 27          55 LOAD_FAST                3 (setter)
             58 LOAD_FAST                4 (i)
             61 LOAD_FAST                2 (getter)
             64 LOAD_FAST                4 (i)
             67 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             70 LOAD_CONST               1 (2)
             73 BINARY_MULTIPLY
             74 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             77 POP_TOP
             78 JUMP_ABSOLUTE           49
        >>   81 POP_BLOCK
        >>   82 LOAD_CONST               0 (None)
             85 RETURN_VALUE
42.435717200998624

 30           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)
 31           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)
 32          12 LOAD_GLOBAL              2 (array)
             15 LOAD_CONST               1 ('f')
             18 LOAD_CONST               2 (<code object <listcomp> at 0x7fab3b1e9ae0, file "t.py", line 32>)
             21 LOAD_CONST               3 ('f4.<locals>.<listcomp>')
             24 MAKE_FUNCTION            0
             27 LOAD_FAST                0 (a)
             30 GET_ITER
             31 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             34 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             37 LOAD_FAST                1 (b)
             40 LOAD_CONST               0 (None)
             43 LOAD_CONST               0 (None)
             46 BUILD_SLICE              2
             49 STORE_SUBSCR
             50 LOAD_CONST               0 (None)
             53 RETURN_VALUE
19.328940125000372
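The approach used in f4 can be wrapped in a small helper. The sketch below is one possible shape, not part of the benchmark: the name `map_in_place` is hypothetical. It relies on the fact that slice assignment on an `array.array` requires another array of the same typecode, which is why the new values are wrapped in `array(arr.typecode, ...)`.

```python
from array import array

def map_in_place(arr, func):
    """Replace every item of `arr` with func(item), keeping the same object."""
    # Build all new values first, then overwrite the whole slice at once.
    arr[:] = array(arr.typecode, [func(v) for v in arr])
    return arr

b = array('f', range(5))
alias = b  # another reference to the same array
map_in_place(b, lambda v: v * 2)
print(alias)  # array('f', [0.0, 2.0, 4.0, 6.0, 8.0])
```

Because the array object itself is preserved, any other reference to it (here `alias`) sees the updated values.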