Question

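I have a simple example problem that I am struggling with in Python. I am using multiprocessing to execute the function "Thread_Test()", which generates an array of uniform random numbers on the interval 0 to 1, with "Sample_Size" data points in the array. Once I get this example working, I plan to spawn multiple copies of the process to try to speed up execution, and then I will put a more complicated set of calculations inside Thread_Test(). The example works fine as long as I keep Sample_Size below 9,000. Execution time rises as I increase Sample_Size from 10 to 8,000, but even at 8,000 the code takes only 0.01 seconds to run. However, as soon as I increase Sample_Size to 9,000, the code runs forever and never finishes the calculation. What is causing this?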

from multiprocessing import Process, Queue
import queue
import random
import timeit
import numpy as np

def Thread_Test(Sample_Size):
    q.put(np.random.uniform(0,1,Sample_Size))
    return

if __name__ == "__main__":
    Sample_Size = 9000
    q = Queue()
    start = timeit.default_timer()
    p = Process(target=Thread_Test,args=(Sample_Size,))
    p.start()
    p.join()

    result = np.array([])
    while True:
        if not q.empty():
            result = np.append(result, q.get())
        else:
            break
    print (result)

    stop = timeit.default_timer()
    print ('{}{:4.2f}{}'.format("Computer Time: ", stop-start, " seconds"))

Answer 1

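The problem occurs because, when the child process (the producer) puts items into the queue, you must make sure the main process (the consumer) is getting them at the same time. Otherwise the main process waits in "p.join()" while the child process cannot finish its "Queue.put": the queue holds more data than fits, and no consumer is making room for more.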

As the documentation says:

Bear in mind that a process that has put items in a queue will wait before terminating until 
all the buffered items are fed by the “feeder” thread to the underlying pipe

So, put simply, you need to do the "get" part before "p.join()".

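If you are worried about the main process exiting before the child process has finished its work, you can change the code as follows: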

while True:
    # check subprocess running before break
    running = p.is_alive()
    if not q.empty():
        result = np.append(result,q.get())
    else:
        if not running:
            break

The complete code is as follows:

from multiprocessing import Process, Queue
import timeit
import numpy as np

def Thread_Test(q, Sample_Size):
    q.put(np.random.uniform(0,1,Sample_Size))


if __name__ == "__main__":
    Sample_Size = 9000
    q = Queue()
    start = timeit.default_timer()
    p = Process(target=Thread_Test,args=(q, Sample_Size,))
    p.daemon = True
    p.start()

    result = np.array([])
    while True:
        running = p.is_alive()
        if not q.empty():
            result = np.append(result,q.get())
        else:
            if not running:
                break
    p.join()
    stop = timeit.default_timer()
    print ('{}{:4.2f}{}'.format("Computer Time: ", stop-start, " seconds"))
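When the child is known to put exactly one item on the queue, the polling loop above can probably be replaced by a single blocking `q.get()` followed by `p.join()`. The following is a minimal sketch of that variant, not the answerer's code: the function names `produce` and `collect` are hypothetical, and a plain list of floats replaces the NumPy array only to keep the sketch dependency-free.

```python
from multiprocessing import Process, Queue
import random


def produce(q, sample_size):
    # Child process: put one large payload on the queue
    # (a list of floats stands in for the NumPy array).
    q.put([random.random() for _ in range(sample_size)])


def collect(sample_size):
    q = Queue()
    p = Process(target=produce, args=(q, sample_size))
    p.start()
    data = q.get()  # blocks until the item arrives, draining the pipe
    p.join()        # safe now: the child has nothing left to flush
    return data


if __name__ == "__main__":
    print(len(collect(9000)))
```

The blocking `get()` avoids the busy-wait on `q.empty()`. If the child puts several items, one option is to have it finish with a sentinel value such as `None` and keep calling `q.get()` until the sentinel shows up.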
