Odd threading behavior in Python

Date: 2016-02-03 03:36:39

Tags: python multithreading scope

I have a problem where I need to pass the index of an array to a function which I define inline. The function then gets passed as a parameter to another function which will eventually call it as a callback.

The thing is, when the code gets called, the value of the index is all wrong. I eventually solved this by creating an ugly workaround but I am interested in understanding what is happening here. I created a minimal example to demonstrate the problem:

from __future__ import print_function
import threading


def works_as_expected():
    for i in range(10):
        run_in_thread(lambda: print('the number is: {}'.format(i)))

def not_as_expected():
    for i in range(10):
        run_later_in_thread(lambda: print('the number is: {}'.format(i)))

def run_in_thread(f):
    threading.Thread(target=f).start()

threads_to_run_later = []
def run_later_in_thread(f):
    threads_to_run_later.append(threading.Thread(target=f))


print('this works as expected:\n')
works_as_expected()

print('\nthis does not work as expected:\n')
not_as_expected()
for t in threads_to_run_later: t.start()

Here is the output:

this works as expected:

the number is: 0
the number is: 1
the number is: 2
the number is: 3
the number is: 4
the number is: 6
the number is: 7
the number is: 7
the number is: 8
the number is: 9

this does not work as expected:

the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9

Can someone explain what is happening here? I assume it has to do with enclosing scope or something, but an answer with a reference that explains this dark (to me) corner of python scoping would be valuable to me.

I'm running this on python 2.7.11

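2 answers:

Answer 0 (score: 3)

This is a consequence of how closures and scoping work in Python.

What is happening is that i is bound in the scope of the not_as_expected function. So even though you hand each thread a lambda function, the variable it uses is shared between every lambda and every thread.

Consider this example: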

def make_function():
    i = 1
    def inside_function():
        print(i)
    i = 2
    return inside_function

f = make_function()
f()

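Which number do you think it will print? The i = 1 from before the function was defined, or the i = 2 from after?

It will print the current value of i (that is, 2). It does not matter what the value of i was when the function was created; it always uses the current value. The same thing happens with your lambda functions.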
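
A minimal sketch of one way to observe this directly, assuming the `__closure__` attribute and its `cell_contents` (available in Python 2.7 and 3). The inner function holds a cell that refers to the variable i, and reading that cell before and after the reassignment shows the value changing underneath the closure. The names before and after are added here for illustration only.

```python
def make_function():
    i = 1

    def inside_function():
        return i

    # The closure stores a cell pointing at the variable i, not a copy of 1.
    before = inside_function.__closure__[0].cell_contents
    i = 2
    # Same cell, read again after the reassignment.
    after = inside_function.__closure__[0].cell_contents
    return inside_function, before, after


f, before, after = make_function()
print('cell before: {}, cell after: {}, f() returns: {}'.format(before, after, f()))
```

The call f() returns whatever the cell holds at call time, which is why the value at definition time never matters.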

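Even in your expected results you can see that it does not always work correctly: it skips 5 and shows 7 twice. What happens in that case is that each lambda usually runs before the loop moves on to the next iteration. But in some cases (like the 5), the loop manages to get through two iterations before control passes to one of the other threads, so i is incremented twice and a number is skipped. In other cases (like the 7), two threads manage to run while the loop is still in the same iteration, and since i does not change between the two threads, the same value is printed.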
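
The timing dependence can be taken out of the picture in a small sketch. Assuming a hypothetical helper run_and_wait (not part of the question) that joins each thread before the loop advances, every lambda reads i while it still holds that iteration's value, so the shared variable happens to look correct:

```python
import threading

results = []


def run_and_wait(f):
    # Hypothetical helper: start the thread and block until it finishes,
    # so the loop cannot move on while the callback is still pending.
    t = threading.Thread(target=f)
    t.start()
    t.join()


for i in range(5):
    # The lambda still looks up i when it is called, not when it is created.
    run_and_wait(lambda: results.append(i))

print(results)
```

This only hides the late binding by serializing the threads. The lambdas still share a single variable, so it is not a fix for callbacks that run later.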

If you do this instead:

def function_maker(i):
    return lambda: print('the number is: {}'.format(i))

def not_as_expected():
    for i in range(10):
        run_later_in_thread(function_maker(i))

The i variable is bound inside the function_maker function along with the lambda. Each lambda function refers to a different variable, and it works as expected.
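
As a rough end-to-end check of this approach, the sketch below reuses run_later_in_thread from the question but collects the values in a list instead of printing them, so the result can be inspected. The collected list and the join loop are additions for illustration, and the sketch relies on list.append being safe to call from several threads in CPython.

```python
import threading

collected = []
threads_to_run_later = []


def function_maker(i):
    # i is a parameter here, so every call gets its own fresh variable.
    return lambda: collected.append(i)


def run_later_in_thread(f):
    threads_to_run_later.append(threading.Thread(target=f))


for i in range(10):
    run_later_in_thread(function_maker(i))

for t in threads_to_run_later:
    t.start()
for t in threads_to_run_later:
    t.join()

# Threads may finish in any order, so compare the sorted values.
print(sorted(collected))
```

Each number from 0 to 9 should appear exactly once, although the order in which the threads run is not guaranteed.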

Answer 1 (score: 2)

Closures in Python capture free variables, not the values those variables have at the time the closure is created. For example:

def make_closures():
    L = []

    # Captures variable L
    def push(x):
        L.append(x)
        return len(L)

    # Captures the same variable
    def pop():
        return L.pop()

    return push, pop

pushA, popA = make_closures()
pushB, popB = make_closures()

pushA(10); pushB(20); pushA(30); pushB(40)
print(popA(), popA(), popB(), popB())

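This will display 30, 10, 40, 20: the first pair of closures, pushA and popA, refers to one list L, while the second pair, pushB and popB, refers to another, independent list.

The important point is that each pair of push and pop closures refers to the same list, i.e. they capture the variable L and not the value of L at the time of creation. If L is mutated by one closure, the other will see the change.

A common mistake is, for example, expecting that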

L = []
for i in range(10):
    L.append(lambda : i)
for x in L:
    print(x())

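will display the numbers from 0 to 9... Here all the unnamed closures capture the same variable i used for the loop, and all of them return the same value when called.

A common Python idiom for solving this problem is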

L.append(lambda i=i: i)

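i.e. using the fact that default values for parameters are evaluated when the function is created. With this approach each closure returns a different value, because each one returns its own private local variable (a parameter that has a default value).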
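
A short side-by-side sketch of the two behaviors, with the list names late_bound and early_bound chosen here purely for illustration:

```python
late_bound = []
early_bound = []

for i in range(3):
    late_bound.append(lambda: i)        # i is looked up when the lambda is called
    early_bound.append(lambda i=i: i)   # default is evaluated now and stored per function

late_values = [f() for f in late_bound]
early_values = [f() for f in early_bound]

print(late_values)
print(early_values)
```

One caveat with the default-argument idiom: the value is only a default, so a caller that passes an argument will override it. For a callback invoked with no arguments, as with threading.Thread(target=f) in the question, that does not come up.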