Determining when a ThreadPool is finished processing a queue

Time: 2011-10-28 01:14:02

Tags: python queue threadpool multiprocessing

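I am trying to implement a thread pool that processes a task queue using ThreadPool and Queue. It starts with an initial queue of tasks, and each task can also push additional tasks onto the task queue. The problem is that I don't know how to block until the queue is empty and the thread pool has finished processing, while still checking the queue and submitting any newly pushed tasks to the thread pool. I can't simply call ThreadPool.join(), because I need to keep the pool open for new tasks.

For example: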

from multiprocessing.pool import ThreadPool
from Queue import Queue
from random import random
import time
import threading

queue = Queue()
pool = ThreadPool()
stdout_lock = threading.Lock()

def foobar_task():
    with stdout_lock: print "task called" 
    if random() > .25:
        with stdout_lock: print "task appended to queue"
        queue.put(foobar_task)
    time.sleep(1)

# set up initial queue
for n in range(5):
    queue.put(foobar_task)

# run the thread pool
while not queue.empty():
    task = queue.get() 
    pool.apply_async(task)

with stdout_lock: print "pool is closed"
pool.close()
pool.join()

Output:

pool is closed
task called
task appended to queue
task called
task appended to queue
task called
task appended to queue
task called
task appended to queue
task called
task appended to queue

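This exits the while loop before the foobar_tasks have appended anything to the queue, so the appended tasks are never submitted to the thread pool. I can't find any way to determine whether the thread pool still has any active worker threads. I tried the following: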

while not queue.empty() or any(worker.is_alive() for worker in pool._pool):
    if not queue.empty():
        task = queue.get() 
        pool.apply_async(task)
    else:   
        with stdout_lock: print "waiting for worker threads to complete..."
        time.sleep(1)

But it seems that worker.is_alive() always returns true, so this goes into an infinite loop.
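
A likely explanation, assuming CPython's implementation of the pool: its workers are long-lived threads that sit waiting for the next task, so they count as alive even when they have nothing to do. The sketch below is written for Python 3 and only illustrates that observation; `_pool` is a private attribute (the same one used in the loop above), not a public API, and `idle_alive` is a name made up for this example.

```python
from multiprocessing.pool import ThreadPool

pool = ThreadPool(2)

# Run one task to completion; after this call returns the pool has no work left.
pool.apply(sum, ([1, 2, 3],))

# _pool is a private CPython detail, used here only to mirror the question.
# Each entry is a worker thread that stays alive while waiting for more tasks.
idle_alive = [worker.is_alive() for worker in pool._pool]
print(idle_alive)

pool.close()
pool.join()
```

Because idle workers still report as alive, is_alive() says nothing about whether tasks are still running, which is why the loop above never terminates.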

Is there a better way to do this?

1 answer:

Answer 0 (score: 2)

  1. Call queue.task_done() after each task has been processed.
  2. Then you can call queue.join() to block the main thread until all tasks have completed.
  3. To terminate the worker threads, put a sentinel (e.g. None) in the queue, and have foobar_task break out of its while-loop when it receives the sentinel.
  4. I think this is easier to implement with threading.Thread than with ThreadPool.

import random
import time
import threading
import logging
import Queue

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG)

sentinel = None
queue = Queue.Queue()
num_threads = 5

def foobar_task(queue):
    # each worker thread loops until it receives the sentinel
    while True:
        n = queue.get()
        logger.info('task called: {n}'.format(n=n))
        if n is sentinel: break
        n = random.random()
        if n > .25:
            logger.info("task appended to queue")
            queue.put(n)
        # mark this item as processed so that queue.join() can return
        queue.task_done()

# set up initial queue
for i in range(num_threads):
    queue.put(i)

threads = [threading.Thread(target=foobar_task, args=(queue,))
           for n in range(num_threads)]
for t in threads:
    t.start()

# block until every queued item has been marked done
queue.join()

# one sentinel per thread tells the workers to exit
for i in range(num_threads):
    queue.put(sentinel)

for t in threads:
    t.join()
logger.info("threads are closed")
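
If keeping the ThreadPool is a requirement, the same task_done() / join() bookkeeping can probably be layered on top of it. The following is only a sketch under stated assumptions, not the answer's method: it is written for Python 3 (where the module is named queue rather than Queue), the names make_task, run_and_mark_done, dispatcher, finished, and the depth parameter are hypothetical and introduced for illustration, and the random follow-up tasks from the question are replaced by a fixed depth so that the run is deterministic.

```python
import queue
import threading
from multiprocessing.pool import ThreadPool

SENTINEL = None                  # hypothetical stop marker for the dispatcher
task_queue = queue.Queue()
pool = ThreadPool(4)
finished = []                    # records every task that has run
finished_lock = threading.Lock()


def make_task(depth):
    """Build a task that enqueues one follow-up task while depth > 0."""
    def task():
        if depth > 0:
            task_queue.put(make_task(depth - 1))
    return task


def run_and_mark_done(task):
    """Run one task in a pool worker, then tell the queue it is finished."""
    try:
        task()
    finally:
        with finished_lock:
            finished.append(task)
        # Called only after the task has enqueued its follow-ups, so the
        # queue's unfinished count cannot reach zero too early.
        task_queue.task_done()


def dispatcher():
    """Hand queued tasks to the pool until the sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is SENTINEL:
            task_queue.task_done()
            break
        pool.apply_async(run_and_mark_done, (task,))


# set up initial queue: 5 tasks, each followed by 2 generations of follow-ups
for _ in range(5):
    task_queue.put(make_task(depth=2))

feeder = threading.Thread(target=dispatcher)
feeder.start()

task_queue.join()                # returns once every queued task has finished
task_queue.put(SENTINEL)         # now it is safe to stop the dispatcher
feeder.join()
pool.close()
pool.join()
print("pool is closed; %d tasks ran" % len(finished))
```

The design choice here is to let Queue count outstanding work instead of inspecting the pool's workers: every put() raises the unfinished count and every task_done() lowers it, so queue.join() returns only when no task is queued or still running.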