Situation: consider the following two snippets of Python code:
Snippet 1:
for root, dirs, files in os.walk(top):
    for f in files:
        path = os.path.join(root, f)
        print(path)
Snippet 2:
for root, dirs, files in os.walk(top):
    for f in files:
        print(os.path.join(root, f))
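Question: if I don't assign the file path to a variable, will there be any difference in performance or speed? (Assume I only use the path once. If I used it several times, assigning it to a variable would obviously make more sense.)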
Answer 0: (score: 1)
Besides running a simple benchmark with timeit, you can also use pytest-benchmark, which makes it very easy to set up a comparison:
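import os


def f1(top):
    # Variant 1: bind the joined path to a local name first.
    for root, dirs, files in os.walk(top):
        for f in files:
            path = os.path.join(root, f)
            print(path)


def f2(top):
    # Variant 2: pass the joined path straight to print().
    for root, dirs, files in os.walk(top):
        for f in files:
            print(os.path.join(root, f))


# Note: os.walk() does not expand "~" by itself, and a path that cannot be
# listed simply yields nothing. Use os.path.expanduser('~/tmp') if you want
# to be sure the home directory is what gets walked.
def test_f1(benchmark):
    benchmark(f1, '~/tmp')


def test_f2(benchmark):
    benchmark(f2, '~/tmp')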
Note: ~/tmp contains 350 files/folders, YMMV. Running
python -m pytest test.py --benchmark-min-time=0.001 --benchmark-histogram=hist
gives you nice statistics and a histogram:
---------------------------------------------------------------------- benchmark: 2 tests ----------------------------------------------------------------------
Name (time in us)    Min              Max              Mean             StdDev           Median           IQR              Outliers(*)  Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------
test_f1              4.4811 (1.0)     8.6253 (1.0)     4.7941 (1.00)    0.3531 (1.0)     4.7141 (1.01)    0.2762 (1.31)    15;7         216     1000
test_f2              4.4967 (1.00)    9.3009 (1.08)    4.7773 (1.0)     0.5242 (1.48)    4.6838 (1.0)     0.2113 (1.0)     6;13         215     1000
----------------------------------------------------------------------------------------------------------------------------------------------------------------
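As you can see, the difference is not significant given the high variance: the two means differ by less than 0.4%, while the standard deviation of each run is roughly 7% to 11% of its mean.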
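For comparison, here is a rough sketch of the plain timeit approach mentioned earlier. It is not the setup used for the numbers above: the function names, the temporary directory with empty files, and the choice to collect paths in a list instead of printing them (so terminal I/O does not dominate the timing) are all assumptions made for illustration.

```python
import os
import tempfile
import timeit


def join_with_variable(top):
    """Bind each joined path to a local name before using it."""
    out = []
    for root, dirs, files in os.walk(top):
        for f in files:
            path = os.path.join(root, f)
            out.append(path)
    return out


def join_inline(top):
    """Use the joined path directly, without an intermediate name."""
    out = []
    for root, dirs, files in os.walk(top):
        for f in files:
            out.append(os.path.join(root, f))
    return out


def compare(n_files=50, number=20, repeat=5):
    """Time both variants on a throwaway directory (hypothetical setup).

    Returns a dict mapping function name to the best observed
    time per call, in seconds.
    """
    with tempfile.TemporaryDirectory() as top:
        # Create some empty files so the walk has something to visit.
        for i in range(n_files):
            open(os.path.join(top, f"file_{i}.txt"), "w").close()
        results = {}
        for fn in (join_with_variable, join_inline):
            timings = timeit.repeat(lambda: fn(top), number=number, repeat=repeat)
            # The minimum is usually the least noisy estimate.
            results[fn.__name__] = min(timings) / number
    return results


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds * 1e6:.1f} us per call")
```

Expect the two numbers to be very close and to swap order between runs; the filesystem walk costs far more than one extra local variable.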
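Now, if you are still curious, you can use dis to show the bytecode that CPython actually executes. This is specific to the CPython interpreter, which is the most common way to run Python code: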
In [1]: import os, dis

In [2]: def f1(top):
   ...:     for root, dirs, files in os.walk(top):
   ...:         for f in files:
   ...:             path = os.path.join(root, f)
   ...:             print(path)
   ...:

In [3]: def f2(top):
   ...:     for root, dirs, files in os.walk(top):
   ...:         for f in files:
   ...:             print(os.path.join(root, f))
   ...:
In [4]: dis.dis(f1)
  2      0 SETUP_LOOP       60 (to 62)
         2 LOAD_GLOBAL       0 (os)
         4 LOAD_ATTR         1 (walk)
         6 LOAD_FAST         0 (top)
         8 CALL_FUNCTION     1
        10 GET_ITER
    >>  12 FOR_ITER         46 (to 60)
        14 UNPACK_SEQUENCE   3
        16 STORE_FAST        1 (root)
        18 STORE_FAST        2 (dirs)
        20 STORE_FAST        3 (files)

  3     22 SETUP_LOOP       34 (to 58)
        24 LOAD_FAST         3 (files)
        26 GET_ITER
    >>  28 FOR_ITER         26 (to 56)
        30 STORE_FAST        4 (f)

  4     32 LOAD_GLOBAL       0 (os)
        34 LOAD_ATTR         2 (path)
        36 LOAD_ATTR         3 (join)
        38 LOAD_FAST         1 (root)
        40 LOAD_FAST         4 (f)
        42 CALL_FUNCTION     2
        44 STORE_FAST        5 (path)

  5     46 LOAD_GLOBAL       4 (print)
        48 LOAD_FAST         5 (path)
        50 CALL_FUNCTION     1
        52 POP_TOP
        54 JUMP_ABSOLUTE    28
    >>  56 POP_BLOCK
    >>  58 JUMP_ABSOLUTE    12
    >>  60 POP_BLOCK
    >>  62 LOAD_CONST        0 (None)
        64 RETURN_VALUE
In [5]: dis.dis(f2)
  2      0 SETUP_LOOP       56 (to 58)
         2 LOAD_GLOBAL       0 (os)
         4 LOAD_ATTR         1 (walk)
         6 LOAD_FAST         0 (top)
         8 CALL_FUNCTION     1
        10 GET_ITER
    >>  12 FOR_ITER         42 (to 56)
        14 UNPACK_SEQUENCE   3
        16 STORE_FAST        1 (root)
        18 STORE_FAST        2 (dirs)
        20 STORE_FAST        3 (files)

  3     22 SETUP_LOOP       30 (to 54)
        24 LOAD_FAST         3 (files)
        26 GET_ITER
    >>  28 FOR_ITER         22 (to 52)
        30 STORE_FAST        4 (f)

  4     32 LOAD_GLOBAL       2 (print)
        34 LOAD_GLOBAL       0 (os)
        36 LOAD_ATTR         3 (path)
        38 LOAD_ATTR         4 (join)
        40 LOAD_FAST         1 (root)
        42 LOAD_FAST         4 (f)
        44 CALL_FUNCTION     2
        46 CALL_FUNCTION     1
        48 POP_TOP
        50 JUMP_ABSOLUTE    28
    >>  52 POP_BLOCK
    >>  54 JUMP_ABSOLUTE    12
    >>  56 POP_BLOCK
    >>  58 LOAD_CONST        0 (None)
        60 RETURN_VALUE
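So the first version does produce more bytecode: it has two extra instructions, the STORE_FAST that binds path and the LOAD_FAST that reads it back, and both are cheap local-variable operations.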
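If you would rather not compare the listings by eye, something along these lines counts the instructions programmatically. It is only a sketch: instruction_count and local_names are hypothetical helper names, and the exact opcodes and counts depend on the CPython version, so the numbers you get may not match the listings above.

```python
import dis
import os


def f1(top):
    for root, dirs, files in os.walk(top):
        for f in files:
            path = os.path.join(root, f)
            print(path)


def f2(top):
    for root, dirs, files in os.walk(top):
        for f in files:
            print(os.path.join(root, f))


def instruction_count(func):
    """Number of bytecode instructions dis reports for func."""
    return sum(1 for _ in dis.get_instructions(func))


def local_names(func):
    """Names of the local variables (including arguments) of func."""
    return set(func.__code__.co_varnames)


if __name__ == "__main__":
    # The functions are only disassembled here, never called.
    print("f1:", instruction_count(f1), "instructions")
    print("f2:", instruction_count(f2), "instructions")
    print("locals only in f1:", local_names(f1) - local_names(f2))
```

The only local that exists in f1 but not in f2 is path, which is what the extra store and load operate on.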
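In any case, you should consider profiling: make sure you are looking at the parts of the code that really matter, and avoid optimizing blindly.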
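As a starting point for that, here is a minimal sketch using the standard-library cProfile and pstats modules. The list_paths and profile_walk helpers are made up for this example, and the function collects paths instead of printing them; adapt it to whatever your real code does.

```python
import cProfile
import io
import os
import pstats


def list_paths(top):
    """Collect the full path of every file below top."""
    return [os.path.join(root, f)
            for root, dirs, files in os.walk(top)
            for f in files]


def profile_walk(top, limit=10):
    """Profile list_paths(top) and return (paths, report text).

    The report lists the `limit` most expensive entries,
    sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    paths = profiler.runcall(list_paths, top)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return paths, buffer.getvalue()


if __name__ == "__main__":
    paths, report = profile_walk(os.getcwd())
    print(f"found {len(paths)} files")
    print(report)
```

On a real directory tree the report will typically be dominated by the directory scanning inside os.walk rather than by the loop body, which is the kind of evidence to look for before micro-optimizing.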