Why is this lambda function lazier than the for-loop version?

Asked: 2014-02-21 02:17:58

Tags: python performance lambda lazy-evaluation

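I am writing a blog post about some Python coding styles and came across something I find very strange, and I wonder whether anyone understands what is going on. Basically, I have two versions of the same function: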

a = lambda x: (i for i in range(x))
def b(x):
    for i in range(x):
        yield i

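I wanted to compare the cost of just setting these two up. In my mind this should involve a negligible amount of computation, and both approaches should come out close to zero. However, when I actually run the timings: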

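import timeit

import matplotlib.pyplot as plt
import numpy as np

def timing(x, number=10):
    # Time only the call that creates each generator; nothing is iterated.
    implicit = timeit.timeit('a(%s)' % int(x), 'from __main__ import a', number=number)
    explicit = timeit.timeit('b(%s)' % int(x), 'from __main__ import b', number=number)
    return (implicit, explicit)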

def plot_timings(*args, **kwargs):
    fig = plt.figure()
    ax = fig.add_subplot(1,1,1)
    x_vector = np.linspace(*args, **kwargs)
    timings = np.vectorize(timing)(x_vector)
    ax.plot(x_vector, timings[0], 'b--')
    ax.plot(x_vector, timings[1], 'r--')
    ax.set_yscale('log')
    plt.show()

plot_timings(1, 1000000, 20)

I get a huge difference between the two approaches, as shown below:

a is in blue, b is in red.

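Why is the difference so large? It looks as though the explicit for-loop version is also growing logarithmically, while the implicit version does nothing (as it should).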

Any ideas?

2 answers:

Answer 0 (score: 2)

The difference is caused by range.

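a has to call range when the generator is constructed, whereas b does not call range until its first iteration. In Python 2, range builds the complete list, so that call costs time proportional to x.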
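One way to picture this, as a rough sketch rather than the exact code the compiler produces: a generator expression behaves roughly like a hidden generator function that receives its outermost iterable as an argument. The names `_genexpr` and `a_desugared` below are made up for illustration. The point is that `range(x)` is evaluated by the caller, at construction time, and only the loop body is deferred.

```python
def _genexpr(iterable):
    # Hypothetical name for the hidden generator behind the expression.
    # Only this loop is deferred until the first next().
    for i in iterable:
        yield i

def a_desugared(x):
    # range(x) is evaluated right here, when a_desugared(x) is called,
    # before anyone asks for the first item.
    return _genexpr(range(x))

a = lambda x: (i for i in range(x))

# Both versions yield the same items.
same = list(a(5)) == list(a_desugared(5))
print(same)
```

In the generator function b, by contrast, the `range(x)` call sits inside the function body, so it is deferred together with everything else.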

>>> def myrange(n):
...     print "myrange(%s)"%n
...     return range(n)
... 
>>> a = lambda x: (i for i in myrange(x))
>>> def b(x):
...     for i in myrange(x):
...         yield i
... 
>>> a(100)
myrange(100)
range(100)
<generator object <genexpr> at 0xd62d70>
>>> b(100)
<generator object b at 0xdadb90>
>>> next(_)   # <-- first iteration of b(100)
myrange(100)
range(100)
0
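The session above is Python 2. A minimal Python 3 sketch of the same experiment follows, assuming a hypothetical helper `eager_range` that builds a full list to mimic Python 2's range (Python 3's own range is lazy, so the cost difference would not show up with it). Instead of printing, it records each call in a list so the order of events can be inspected.

```python
calls = []  # records every time the range stand-in actually runs

def eager_range(n):
    # Hypothetical stand-in for Python 2's range(): builds the whole list.
    calls.append(n)
    return list(range(n))

# Generator expression: the outermost iterable is evaluated as soon as
# the expression is evaluated, i.e. when a(x) is called.
a = lambda x: (i for i in eager_range(x))

def b(x):
    # Generator function: none of the body runs until the first next().
    for i in eager_range(x):
        yield i

gen_a = a(100)
calls_after_a = list(calls)     # eager_range has already run once

gen_b = b(100)
calls_after_b = list(calls)     # unchanged: b's body has not started

first = next(gen_b)
calls_after_next = list(calls)  # the second call happens only now

print(calls_after_a, calls_after_b, calls_after_next, first)
```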

Answer 1 (score: 0)

The lambda call is slow. Take a look at this:

import cProfile

a = lambda x: (i for i in range(x))

def b(x):
    for i in range(x):
        yield i

def c(x):
    for i in xrange(x):
        yield i

def d(x):
    i = 0
    while i < x:
        yield i
        i += 1


N = 100000
print " -- a --"
cProfile.run("""
for x in xrange(%i):
    a(x)
""" % N)

print " -- b --"
cProfile.run("""
for x in xrange(%i):
    b(x)
""" % N)

print " -- c --"
cProfile.run("""
for x in xrange(%i):
    c(x)
""" % N)

print " -- d --"
cProfile.run("""
for x in xrange(%i):
    d(x)
""" % N)

print " -- a (again) --"
cProfile.run("""
for x in xrange(%i):
    a(x)
""" % N)

This gives me the following results:

 -- a --
         300002 function calls in 61.764 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   30.881   30.881   61.764   61.764 <string>:3(<module>)
   100000    0.051    0.000    0.051    0.000 test.py:5(<genexpr>)
   100000    0.247    0.000   30.832    0.000 test.py:5(<lambda>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000   30.585    0.000   30.585    0.000 {range}


 -- b --
         100002 function calls in 0.076 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.066    0.066    0.076    0.076 <string>:3(<module>)
   100000    0.010    0.000    0.010    0.000 test.py:7(b)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


 -- c --
         100002 function calls in 0.075 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.065    0.065    0.075    0.075 <string>:3(<module>)
   100000    0.010    0.000    0.010    0.000 test.py:11(c)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


 -- d --
         100002 function calls in 0.075 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.065    0.065    0.075    0.075 <string>:3(<module>)
   100000    0.010    0.000    0.010    0.000 test.py:15(d)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


 -- a (again) --
         300002 function calls in 60.890 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   30.487   30.487   60.890   60.890 <string>:3(<module>)
   100000    0.049    0.000    0.049    0.000 test.py:5(<genexpr>)
   100000    0.237    0.000   30.355    0.000 test.py:5(<lambda>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000   30.118    0.000   30.118    0.000 {range}
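In the profiles above, the {range} row accounts for nearly all of the cumulative time spent inside the lambda, and it does not appear at all for b, c, or d. A small Python 3 sketch for reproducing that kind of comparison is shown below. It assumes a hypothetical `eager_range` helper that builds a list (to mimic Python 2's range) and a hypothetical `profile_report` wrapper; the absolute numbers will differ from the ones quoted above and depend on the machine.

```python
import cProfile
import io
import pstats

def eager_range(n):
    # Hypothetical stand-in for Python 2's range(): materialises a full list.
    return list(range(n))

a = lambda x: (i for i in eager_range(x))

def b(x):
    for i in eager_range(x):
        yield i

def profile_report(func, n):
    """Create n generators with func, never iterate them, return the profile text."""
    profiler = cProfile.Profile()
    profiler.enable()
    for x in range(n):
        func(x)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the dominant callee is listed near the top.
    stats.strip_dirs().sort_stats("cumulative").print_stats()
    return stream.getvalue()

report_a = profile_report(a, 2000)
report_b = profile_report(b, 2000)
print(report_a)
print(report_b)
```

The report for a should list eager_range, while the report for b should not, since b's body never starts. On Python 2, replacing range with xrange inside the lambda would be the corresponding way to avoid building the list at construction time.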