How to kill the threads spawned when using multiprocessing Pool imap_unordered

Time: 2018-05-19 22:52:51

Tags: python multithreading multiprocessing

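I am trying to use a multiprocessing pool to speed up a simple Python program, specifically the imap_unordered function.

In my case, I am searching for a particular object with a specific property. Checking this property takes a long time, so I want to spread the load across my CPU cores.

I came up with the following code: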

from multiprocessing import Pool as ThreadPool 
pool = ThreadPool(4) 

some_iterator = (create_item() for _ in range(100000))

results = pool.imap_unordered(my_function, some_iterator)

for result in results:
  if is_favourable(result):
    break

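Unfortunately, after break is called, there is still a lot of activity in the threads (as I can see in my computer's activity monitor). How should I search only until a favourable result is found, or how can I stop the imap_unordered iterator from working through all the remaining items?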

2 answers:

Answer 0 (score: 1)

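Pool.terminate() stops the worker processes immediately, whereas Pool.close() only prevents further tasks from being submitted; the processes exit once the outstanding tasks have finished.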
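To make the difference between the two shutdown methods concrete, here is a minimal sketch that counts how many tasks actually ran before the pool shut down. It is an illustration under assumptions, not part of the original answer: it uses a ThreadPool so the counter can live in shared memory, and the names run and work as well as the task counts and sleep time are made up for the example. Note that with a thread pool, terminate() cannot interrupt a task that is already running; it only discards the tasks still waiting in the queue.

```python
from multiprocessing.pool import ThreadPool
import time


def run(shutdown, tasks=40, workers=4):
    """Return how many tasks ran before the pool was shut down."""
    done = []  # list.append is thread-safe in CPython

    def work(i):
        time.sleep(0.02)  # imitate a slow check
        done.append(i)
        return i

    pool = ThreadPool(workers)
    results = pool.imap_unordered(work, range(tasks))
    next(results)  # wait for the first result only

    if shutdown == 'close':
        pool.close()      # no new submissions; already submitted tasks still run
    else:
        pool.terminate()  # stop now and drop the tasks still queued
    pool.join()           # wait for the worker threads to exit
    return len(done)


if __name__ == '__main__':
    print('close:    ', run('close'))
    print('terminate:', run('terminate'))
```

After close() and join(), every submitted task has run; after terminate(), only the tasks that had already started get to finish.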

Pool.terminate() is also called when the Pool instance is garbage collected, or when the Pool is used in a with statement, so the following is one solution:

import multiprocessing as mp
import time

def my_function(item):
    print(mp.current_process().name,item)
    time.sleep(2) # imitate a long process
    return item * 2

def is_favourable(item):
    return item == 20   # something to look for (result of item 10)

def find():
    with mp.Pool() as pool:
        some_iterator = range(100)
        results = pool.imap_unordered(my_function, some_iterator)
        for result in results:
            print(result)
            if is_favourable(result):
                return result  # pool will be terminated exiting with.

if __name__ == '__main__':
    start = time.time()
    find()
    print(time.time() - start)

A single thread would take 22 seconds to find item 10. On my 8-core system, it finds it in ~4 seconds:

SpawnPoolWorker-2 0
SpawnPoolWorker-3 1
SpawnPoolWorker-1 2
SpawnPoolWorker-5 3
SpawnPoolWorker-4 4
SpawnPoolWorker-8 5
SpawnPoolWorker-7 6
SpawnPoolWorker-6 7
SpawnPoolWorker-1 8
SpawnPoolWorker-3 9
SpawnPoolWorker-2 10
4
2
0
8
SpawnPoolWorker-4 11
SpawnPoolWorker-8 12
10
SpawnPoolWorker-5 13
6
12
SpawnPoolWorker-7 14
SpawnPoolWorker-6 15
14
SpawnPoolWorker-3 16
18
SpawnPoolWorker-1 17
SpawnPoolWorker-2 18
16
20
4.203129768371582

Answer 1 (score: 1)

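For starters, your sample code is not actually using a multiprocessing ThreadPool, because your import statement is wrong (it effectively just renames the regular Pool class to ThreadPool).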
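One way to check which kind of pool is really in use is to look at the process ID reported by the workers. The sketch below does this for a genuine ThreadPool imported from multiprocessing.pool; worker_pid and pids_seen are illustrative names, not part of the original answer. With the import from the question (multiprocessing.Pool under the name ThreadPool), the workers would instead be separate processes and report different PIDs.

```python
import os
from multiprocessing.pool import ThreadPool  # the thread-backed pool


def worker_pid(_):
    # Each worker reports the ID of the process it runs in.
    return os.getpid()


def pids_seen(workers=4, tasks=8):
    """Collect the distinct process IDs seen by the pool's workers."""
    with ThreadPool(workers) as pool:
        return set(pool.map(worker_pid, range(tasks)))


if __name__ == '__main__':
    # Threads share the parent's process, so only one PID shows up.
    print(pids_seen() == {os.getpid()})
```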

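In any case, since Python 3.3 you can use a Pool / ThreadPool as a context manager and put the loop inside it. Its terminate() method is then called automatically when the context is exited (which is what the break statement in the example below triggers).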

from multiprocessing import current_process
from multiprocessing.pool import ThreadPool
from random import randint
import time

def create_item():
    return randint(0, 20)

def is_favourable(value):
    return value < 20

def my_function(value):
    print(current_process().name, value)
    time.sleep(2)
    return value * 2

if __name__ == '__main__':
    with ThreadPool(4) as pool:  # Use as context manager (Python 3.3+)
        some_iterator = (create_item() for _ in range(10000))
        start = time.time()
        results = pool.imap_unordered(my_function, some_iterator)
        for result in results:
            print('result:', result)
            if is_favourable(result):
                break  # Stop loop and exit Pool context.
    print('done')
    print(time.time() - start)

If you are using an older version of Python, you can call pool.terminate() explicitly immediately before the break statement (instead of using a with statement).
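For Python versions without context-manager support, the explicit variant might look like the sketch below. It is a variation on the suggestion above rather than a literal transcription: it wraps the loop in try/finally instead of calling terminate() right before break, so the pool is also shut down if the loop raises. The names slow_double and find_first and the target parameter are hypothetical, and the sleep is shortened so the example finishes quickly.

```python
from multiprocessing.pool import ThreadPool
import time


def slow_double(value):
    # Hypothetical stand-in for an expensive computation.
    time.sleep(0.05)
    return value * 2


def find_first(target, items, workers=4):
    """Return the first result equal to target, or None if there is none."""
    pool = ThreadPool(workers)
    try:
        for result in pool.imap_unordered(slow_double, items):
            if result == target:
                return result  # leaving the loop early triggers the finally block
        return None
    finally:
        # Runs on early return, normal completion, and exceptions alike.
        pool.terminate()
        pool.join()


if __name__ == '__main__':
    print(find_first(20, range(100)))
```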