Question

What I can't figure out is that, although ThreadPoolExecutor uses daemon workers, they keep running even after the main thread exits.

I can provide a minimal example in Python 3.6.4:

import concurrent.futures
import time


def fn():
    while True:
        time.sleep(5)
        print("Hello")


thread_pool = concurrent.futures.ThreadPoolExecutor()
thread_pool.submit(fn)
while True:
    time.sleep(1)
    print("Wow")

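Both the main thread and the worker thread run infinite loops. So if I terminate the main thread with a KeyboardInterrupt, I expect the whole program to terminate as well. In fact, the worker thread keeps running even though it is a daemon thread.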

The source code of ThreadPoolExecutor confirms that worker threads are daemon threads:

t = threading.Thread(target=_worker,
                     args=(weakref.ref(self, weakref_cb),
                           self._work_queue))
t.daemon = True
t.start()
self._threads.add(t)

Moreover, if I create a daemon thread manually, it works like a charm:

from threading import Thread
import time


def fn():
    while True:
        time.sleep(5)
        print("Hello")


thread = Thread(target=fn)
thread.daemon = True
thread.start()
while True:
    time.sleep(1)
    print("Wow")

So I really can't figure out this strange behavior.

Answer 1

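Suddenly... I found the reason. According to more of the source code of ThreadPoolExecutor, there is an exit handler that joins all unfinished workers, so the interpreter waits for them at exit even though they are daemon threads.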
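To make the mechanism concrete, here is a minimal sketch of how an exit handler can keep daemon threads alive at interpreter exit. It is not the library's code: the names `thread_queues`, `worker`, `start_worker` and `join_all_workers` are hypothetical and only mirror the idea described above.

```python
import atexit
import queue
import threading
import weakref

# Hypothetical module-level registry: worker thread -> its work queue.
thread_queues = weakref.WeakKeyDictionary()
results = []


def worker(work_queue):
    """Run callables from the queue until a None sentinel arrives."""
    while True:
        item = work_queue.get()
        if item is None:
            return
        results.append(item())


def start_worker():
    """Start one daemon worker and record it in the registry."""
    work_queue = queue.Queue()
    thread = threading.Thread(target=worker, args=(work_queue,))
    thread.daemon = True
    thread.start()
    thread_queues[thread] = work_queue
    return thread, work_queue


def join_all_workers():
    """Ask every worker to stop, then block until each one has finished."""
    items = list(thread_queues.items())
    for thread, work_queue in items:
        work_queue.put(None)
    for thread, work_queue in items:
        thread.join()


# Registering the handler is what defeats daemon=True: at exit the
# interpreter runs it and blocks in join() for as long as a worker is busy.
atexit.register(join_all_workers)
```

If a worker is stuck in an endless task such as `fn` from the question, it never reads the sentinel, so `join()` never returns and the process stays alive.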

Answer 2

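Here is a way to avoid this problem. A bad design can be beaten by another bad design. People write daemon=True only when they really know that the worker will not damage any objects or files.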

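In my case, I created a ThreadPoolExecutor with a single worker, and after a single submit I just deleted the newly created thread from the queue, so the interpreter does not wait for this thread to stop on its own. Note that worker threads are created after submit, not after the ThreadPoolExecutor is initialized.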

import concurrent.futures.thread
from concurrent.futures import ThreadPoolExecutor

...

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(lambda: self._exec_file(args))
del concurrent.futures.thread._threads_queues[list(executor._threads)[0]]

It works in Python 3.8, but may not work in 3.9+, because this code accesses private variables.

See the code on GitHub.
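Because the workaround in this answer depends on private names, a more portable option is to let the worker stop cooperatively. The sketch below is not from the original answer; it assumes the worker loop can poll a `threading.Event`, and the helper name `run` and its parameters are hypothetical.

```python
import concurrent.futures
import threading
import time


def fn(stop_event, interval=0.05):
    """Loop until stop_event is set; return the number of completed iterations."""
    iterations = 0
    # Event.wait() returns False on timeout and True once the event is set.
    while not stop_event.wait(interval):
        iterations += 1
    return iterations


def run(main_iterations=3, interval=0.05):
    """Run the worker alongside a short main loop, then stop it cleanly."""
    stop_event = threading.Event()
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as thread_pool:
        future = thread_pool.submit(fn, stop_event, interval)
        try:
            for _ in range(main_iterations):
                time.sleep(interval)
        except KeyboardInterrupt:
            pass  # fall through to the cleanup below
        finally:
            # Signal the worker so the executor's shutdown does not block forever.
            stop_event.set()
        return future.result(timeout=5)
```

With this shape, a Ctrl+C in the main loop sets the event, the worker returns, and the executor's exit handling has nothing left to wait for. It only helps when the task checks the event regularly; a worker blocked in a long call will still delay exit.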

Workers in ThreadPoolExecutor are not really daemon threads
