Multiprocessing with a queue and a worker pool, letting workers extend the queue

Date: 2017-03-08 13:42:14

Tags: python python-multiprocessing

I am trying to understand multiprocessing in Python, but I am currently stuck on the following problem:

Starting from a pool of workers, I want to feed objects from a generator function into a queue, to be consumed by the workers. This works fine, but I now want to extend my program so that the workers themselves can add work to the queue. This is the part I am having trouble with, because the work I add in the first loop is immediately followed by the stop sentinels added in the second loop (see the example code). As a result, any extra work added by a worker is never executed...

I think all that is needed is a way to check that the queue is empty and that no worker is processing anything, and only then proceed to the final loop that stops the workers. However, I do not know how to check a worker's status in order to do this.

Minimal code demonstrating the problem:

import multiprocessing
import random
import time

def f(queue):
    worker_name = multiprocessing.current_process().name
    print("Started: {}".format(worker_name))

    while True:
        value = queue.get()
        if value is None:
            break

        print("{} is processing '{}'".format(worker_name, value))
        # compute(value)
        time.sleep(1)

        # Worker may add additional work to the queue
        if random.random() > 0.7:
            queue.put("Extra work!")

    print("Stopping: {}".format(worker_name))


if __name__ == '__main__':
    n_workers = 4
    queue = multiprocessing.Queue()
    pool = multiprocessing.Pool(n_workers, f, (queue,))

    # Feed large objects from a generator
    for i in range(20):
        queue.put(i)

    # All extra work is skipped, because the sentinels below are
    # queued right behind the initial work items

    # Terminate workers after finishing work
    for __ in range(n_workers):
        queue.put(None)

    pool.close()
    pool.join()

    print("Finished!")
    print(queue.get())  # Will yield 'Extra work!', but should be empty

1 answer:

Answer 0 (score: 0)

Using a counting semaphore-like shared value, I managed to achieve what I wanted. I track each worker's activity by decrementing this value when a worker starts processing an item and incrementing it again when the worker is done. The main process can then stop the workers as soon as the queue is empty and no worker is processing anything.

Any suggestions are still welcome.

Example code:

import multiprocessing
import random
import time

def f(queue, semaphore):
    worker_name = multiprocessing.current_process().name
    print("Started: {}".format(worker_name))

    while True:
        value = queue.get()
        if value is None:
            break

        # Mark this worker as busy
        with semaphore.get_lock():
            semaphore.value -= 1

        print("{} is processing '{}'".format(worker_name, value))
        # compute(value)
        time.sleep(1)

        # Worker may add additional work to the queue
        if random.random() > 0.7:
            queue.put("Extra work!")

        # Mark this worker as idle again
        with semaphore.get_lock():
            semaphore.value += 1

    print("Stopping: {}".format(worker_name))


if __name__ == '__main__':
    n_workers = 4
    semaphore = multiprocessing.Value('i', n_workers)
    queue = multiprocessing.Queue()
    pool = multiprocessing.Pool(n_workers, f, (queue, semaphore))

    # Feed large objects from a generator
    for i in range(20):
        queue.put(i)

    # Wait until the queue is empty and every worker is idle
    # (semaphore.value == n_workers means no worker is busy)
    while not queue.empty() or semaphore.value != n_workers:
        time.sleep(0.2)

    # Terminate workers after finishing work
    for __ in range(n_workers):
        queue.put(None)

    pool.close()
    pool.join()

    print("Finished!")
    print(queue.empty())  # True
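
For comparison, here is a minimal sketch (not part of the original answer) of the same idea built on `multiprocessing.JoinableQueue`, whose `task_done()`/`join()` bookkeeping tracks outstanding items for you and avoids both the polling loop and the small window between a `get()` and the semaphore decrement. The `worker`, `run`, and `processed` names are illustrative, not from the question:

```python
import multiprocessing
import random
import time

def worker(queue, processed):
    """Consume items; a worker may push follow-up work back onto the queue."""
    while True:
        value = queue.get()
        try:
            if value is None:
                break
            # compute(value) would go here
            time.sleep(0.01)
            with processed.get_lock():
                processed.value += 1
            if random.random() > 0.7:
                # Put extra work *before* task_done, so the queue's
                # unfinished-task count never drops to zero too early.
                queue.put("Extra work!")
        finally:
            queue.task_done()  # runs even when we break on the sentinel

def run(n_workers=4, n_items=20):
    queue = multiprocessing.JoinableQueue()
    processed = multiprocessing.Value('i', 0)  # shared counter, for illustration
    workers = [multiprocessing.Process(target=worker, args=(queue, processed))
               for _ in range(n_workers)]
    for w in workers:
        w.start()

    for i in range(n_items):
        queue.put(i)

    queue.join()  # returns only once every item, including extra work, is done

    for _ in range(n_workers):
        queue.put(None)  # poison pills
    for w in workers:
        w.join()
    return processed.value

if __name__ == '__main__':
    print("Processed {} items".format(run()))
```

The trade-off is that workers must remember to call `task_done()` for every item they `get()`, sentinels included, or `queue.join()` will block forever.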