Question

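I want to pass a numpy array through a multiprocessing queue. The program works with small arrays (20x20), but larger sizes do not work. Ultimately, I want to pass 4D tensors of shape (100, 1, 16, 12000). I am running Python 3.6 on a Mac.

Example code: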

import numpy as np
from multiprocessing import JoinableQueue, Process


class Writer(Process):
    def __init__(self,que):
        Process.__init__(self)
        self.queue=que

    def run(self):
        for i in range(10):
            data=np.random.randn(30,30)
            self.queue.put(data)
            print(i)


class Reader(Process):
    def __init__(self,que):
        Process.__init__(self)
        self.queue=que

    def run(self):
        while not(self.queue.empty()):
            result=self.queue.get()
            print(result)


def main():
    q = JoinableQueue()
    w=Writer(q)
    r=Reader(q)

    w.start()
    w.join()
    print("DONE WRITING")

    r.start()
    r.join()
    print("DONE READING")




if __name__ == "__main__":
    main()

Answer 1

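Python multiprocessing queues are not well suited to large arrays: each array has to be pickled when it is put on the queue and unpickled when it is taken off, which adds both processing and memory overhead.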
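To put a rough number on that overhead, here is a minimal sketch. It assumes float64 data (which is what np.random.randn produces) and the tensor shape from the question; the variable names are only illustrative.

```python
import pickle

import numpy as np

# Shape taken from the question; float64 is assumed (np.random.randn default).
shape = (100, 1, 16, 12000)
n_bytes = int(np.prod(shape)) * np.dtype(np.float64).itemsize
print(f"raw size of one tensor: {n_bytes / 1e6:.1f} MB")  # 153.6 MB

# Pickling a small array shows that the payload is a full copy of the data
# plus a small header, so every put() serializes the whole array.
small = np.random.randn(30, 30)
payload = pickle.dumps(small, protocol=pickle.HIGHEST_PROTOCOL)
print(small.nbytes, len(payload))
```

So every tensor of that shape is copied at least once on the sending side and once more on the receiving side, on top of the transfer through the pipe.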

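I developed a small package that uses Python's built-in multiprocessing Array class to store the data. Behind the scenes, a queue is used to pass the metadata. Unlike other solutions I have come across, it works on Mac, Windows and Linux. You can install it with: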

pip install arrayqueues

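Instructions, source code and the issue tracker are on GitHub: https://github.com/portugueslab/arrayqueues

For simple use cases it works as a drop-in replacement for a multiprocessing queue. The main difference is that you have to specify how much memory the queue will occupy.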
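The underlying idea (array data in a shared multiprocessing Array, only metadata on the queue) can be sketched roughly as follows. This is not the arrayqueues implementation or its API; the helper names as_numpy and producer and the small demo shape are hypothetical, and the sketch assumes float64 data and a single array slot.

```python
import multiprocessing as mp

import numpy as np


def as_numpy(shared, shape, dtype=np.float64):
    """View the buffer of a multiprocessing.Array as a numpy array, no copy."""
    return np.frombuffer(shared.get_obj(), dtype=dtype).reshape(shape)


def producer(shared, meta_queue, shape):
    # Write the data straight into shared memory...
    view = as_numpy(shared, shape)
    with shared.get_lock():
        view[:] = np.arange(view.size, dtype=np.float64).reshape(shape)
    # ...and send only small metadata (here: the shape) through the queue.
    meta_queue.put(shape)


if __name__ == "__main__":
    shape = (4, 3)  # hypothetical small shape for the demo
    # 'd' is a C double, i.e. float64; the size is fixed up front.
    shared = mp.Array("d", int(np.prod(shape)))
    meta_queue = mp.Queue()

    p = mp.Process(target=producer, args=(shared, meta_queue, shape))
    p.start()
    received_shape = meta_queue.get()  # blocks until the producer has written
    p.join()
    print(as_numpy(shared, received_shape))
```

Because the shared buffer is allocated once, its size has to be known in advance, which is presumably why the package asks for the amount of memory up front.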

As for the reader process: as far as I know, queue.empty() is considered unreliable, and the following pattern is encouraged instead:

from queue import Empty  # the Empty exception is defined in the standard queue module

# inside the process
while True:
    try:
        # without a timeout, get() blocks forever and never raises Empty
        item = queue.get(timeout=1)
    except Empty:
        break
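A related point that likely explains why small arrays work and larger ones hang: the Python documentation warns that joining a process which has put items on a queue can deadlock until those items have been consumed, because the process waits for its buffered data to be flushed to the pipe. The sketch below therefore drains the queue while the writer is still running and joins it afterwards. It swaps in an end-of-stream sentinel alongside the timeout; SENTINEL, writer and drain are hypothetical names, and the array size is arbitrary.

```python
import multiprocessing as mp
from queue import Empty

import numpy as np

SENTINEL = None  # hypothetical end-of-stream marker


def writer(q, n_items, shape):
    for _ in range(n_items):
        q.put(np.random.randn(*shape))
    q.put(SENTINEL)  # tell the reader that nothing more is coming


def drain(q, timeout=30.0):
    """Collect items until the sentinel arrives or the queue stays silent."""
    items = []
    while True:
        try:
            item = q.get(timeout=timeout)
        except Empty:
            break  # give up instead of blocking forever
        if item is SENTINEL:
            break
        items.append(item)
    return items


if __name__ == "__main__":
    q = mp.Queue()
    w = mp.Process(target=writer, args=(q, 10, (300, 300)))
    w.start()
    items = drain(q)  # consume while the writer is still running
    w.join()          # join only after the queue has been drained
    print(len(items), items[0].shape)
```

This still pickles every array, so it addresses the hang rather than the overhead.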
