Question

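I have a binary file that contains a fixed number of images (each image is 1024 * 768). I put each image into a JoinableQueue and analyse it with multiprocessing. This works well for small files, but I get a memory error when I try to read large files. Does anyone know how to store a large file in a buffer/queue (as a string)? (Unfortunately, I can't use Manager or Pool.)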

Answer 1

Have you looked at the io.BytesIO module? You can find it here: https://docs.python.org/release/3.1.3/library/io.html#binary-i-o You can set the buffer size, which solved a memory problem for me once.
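A minimal sketch of buffered, chunked reading with the standard library, in case it helps. Note that io.BytesIO is an in-memory buffer; for a file on disk the buffer size is set through the buffering argument of open(), which returns an io.BufferedReader. The helper names read_images and demo, the temporary file, and the tiny SIZE are assumptions made for illustration only.

```python
import os
import tempfile

SIZE = 16  # bytes per image; a tiny stand-in for 1024 * 768 (assumption)


def read_images(path, size=SIZE, buffering=64 * 1024):
    """Yield consecutive `size`-byte chunks from a binary file.

    Only one chunk is held by the generator at a time, so memory use
    does not grow with the size of the file.
    """
    with open(path, "rb", buffering=buffering) as fp:
        while True:
            chunk = fp.read(size)
            if not chunk:  # empty read means end of file
                break
            yield chunk


def demo():
    """Write three fake images to a throwaway file and read them back."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(48)))  # 3 * SIZE bytes
        path = tmp.name
    try:
        return [len(chunk) for chunk in read_images(path)]
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Because read_images is a generator, the caller decides how many images are alive at once; nothing is accumulated unless the caller stores the chunks.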

Answer 2

You can read about buffers here.
If you are short on memory, you can try forcing garbage collection like this:

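import gc

SIZE = 1024 * 768  # bytes per image, if your image is single channel
MEMOSIZE = 1024  # your memory size, taken here to be in MiB (an assumption)
LIMIT = MEMOSIZE * 1024 * 1024 // SIZE  # how many images fit in that budget

with open('xxx', 'rb') as fp:  # open the file
    queue = []
    while True:
        x = fp.read(SIZE)  # read one image
        if not x:  # empty read means end of file
            break
        queue.append(x)
        # do something
        if len(queue) >= LIMIT:  # budget reached: release the buffered images
            queue.clear()
            gc.collect()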
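
Since the question uses a JoinableQueue, another option is to bound the queue so the reader blocks instead of loading the whole file. The following is only a sketch under stated assumptions, not a tested solution for your data: QUEUE_SLOTS, iter_images, worker and process_file are hypothetical names, the analysis step is left as a comment, and each image is assumed to be SIZE bytes. It uses only Process and JoinableQueue, no Manager or Pool.

```python
import multiprocessing as mp
import sys

SIZE = 1024 * 768   # bytes per image, assuming one byte per pixel
QUEUE_SLOTS = 8     # max images waiting in the queue (assumed value)


def iter_images(fp, size=SIZE):
    """Yield consecutive `size`-byte images from an open binary file."""
    while True:
        data = fp.read(size)
        if len(data) < size:  # end of file, or a trailing partial image
            break
        yield data


def worker(q):
    """Take images off the queue until a None sentinel arrives."""
    while True:
        image = q.get()
        try:
            if image is None:
                break
            # analyse the image here
        finally:
            q.task_done()  # also runs for the sentinel


def process_file(path, n_workers=2):
    q = mp.JoinableQueue(maxsize=QUEUE_SLOTS)
    workers = [mp.Process(target=worker, args=(q,)) for _ in range(n_workers)]
    for w in workers:
        w.start()
    with open(path, "rb") as fp:
        for image in iter_images(fp):
            q.put(image)  # blocks while the queue is full
    for _ in workers:
        q.put(None)  # one sentinel per worker
    q.join()  # wait until every item has been marked done
    for w in workers:
        w.join()


if __name__ == "__main__" and len(sys.argv) > 1:
    process_file(sys.argv[1])
```

With maxsize set, put() waits whenever QUEUE_SLOTS images are already queued, so the reader never runs far ahead of the workers and memory stays roughly proportional to QUEUE_SLOTS rather than to the file size.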

Python huge file reading in multiprocessing
