Passing file objects through a queue in Python

Asked: 2015-03-19 11:17:24

Tags: python multiprocessing temporary-files

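I have multiprocessing tasks that process input data and write the results to temporary files (for later use). However, when I try to pass the file handle to the parent process through a queue, it fails: no exception is raised, but the queue stays empty.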

import multiprocessing, tempfile

def worker(i):
    my_data_object = []
    my_tmp_file = tempfile.NamedTemporaryFile('wb')
    my_tmp_file.write(bytes('Hello world #{}'.format(i), 'utf-8'))
    my_tmp_file.seek(0)
    queue.put(my_tmp_file)

queue = multiprocessing.Queue()

print('Writing...')
proc = []
for i in range(16):
    proc.append(multiprocessing.Process(target = worker, args = (i, )))
    proc[i].start()
for p in proc:
    p.join()

print('Reading...')
my_strings = []
while True:
    try:
        tmp_file = queue.get_nowait()
    except:
        print('All data are read. Queue is now empty')
        break
    my_strings.append(tmp_file.read())
    tmp_file.close()

print('Files content: ', my_strings)
print('Successful termination')

Does anyone know a solution?

1 Answer:

Answer 0: (score: 0)

Keeping the file open seems to be what causes the problem. If you call read in the worker function and close the file afterwards, it works:

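import tempfile
from multiprocessing import Process, Queue

def worker(i, queue):
    # Write to the temporary file, read the contents back, and send only
    # the bytes (which can be pickled) through the queue.
    my_tmp_file = tempfile.NamedTemporaryFile()
    my_tmp_file.write(bytes('Hello world #{}'.format(i), 'utf-8'))
    my_tmp_file.seek(0)
    queue.put(my_tmp_file.read())
    my_tmp_file.close()

if __name__ == '__main__':
    q = Queue()

    processes = [Process(target=worker, args=(i, q)) for i in range(16)]

    for p in processes:
        p.start()

    # Fetch one result per worker before joining. q.qsize() is only
    # approximate and is not implemented on every platform (e.g. macOS).
    for _ in processes:
        out = q.get()
        print(out)

    for p in processes:
        p.join()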

If you try to close the file object without reading it, you get TypeError: cannot serialize '_io.FileIO' object, because _io.FileIO objects cannot be pickled.
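
Everything put on a multiprocessing.Queue is pickled before it is sent, so the point above can be checked without starting any processes. The following is a minimal sketch; is_picklable is a hypothetical helper written for this illustration, not part of the standard library, and the exact error message differs between Python versions.

```python
import pickle
import tempfile


def is_picklable(obj):
    """Return True if pickle.dumps accepts obj (hypothetical helper)."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


with tempfile.TemporaryFile() as tmp:
    tmp.write(b'Hello world')
    tmp.seek(0)
    file_ok = is_picklable(tmp)          # the open file object itself
    bytes_ok = is_picklable(tmp.read())  # the bytes read from it

print('file object picklable:', file_ok)
print('file contents picklable:', bytes_ok)
```

The file object is rejected while its contents are accepted, which is why sending my_tmp_file.read() through the queue works and sending my_tmp_file does not.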

Depending on what you are trying to do, it might help to put the file's .name on the queue, set delete to False, and reopen the file in the parent:

import multiprocessing, tempfile

def worker(i):
    with tempfile.NamedTemporaryFile(delete=False) as my_tmp_file:
        my_tmp_file.write(bytes('Hello world #{}'.format(i), 'utf-8'))
        my_tmp_file.seek(0)
        queue.put(my_tmp_file.name)

queue = multiprocessing.Queue()

print('Writing...')
proc = []
for i in range(16):
    proc.append(multiprocessing.Process(target = worker, args = (i, )))
    proc[i].start()
for p in proc:
    p.join()

print('Reading...')
my_strings = []
while True:
    try:
        tmp_file = queue.get_nowait()
    except Exception as e:
        print('All data are read. Queue is now empty')
        break
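    with open(tmp_file) as f:
        # Store the file's contents, not the file object, which is
        # closed as soon as the with block exits.
        my_strings.append(f.read())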

But you still have to reopen the file, so I am not sure what the benefit would be.
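
One thing the path-passing variant leaves open is cleanup: with delete=False the temporary files stay on disk until something removes them. The sketch below is one possible arrangement, not the only one. It assumes the parent is responsible for deleting each file after reading it, and write_result and read_and_remove are hypothetical helper names introduced for this example. The queue is passed to the worker as an argument so that the code does not depend on a global being inherited by the child process.

```python
import multiprocessing
import os
import tempfile


def write_result(i):
    # delete=False keeps the file on disk after the handle is closed.
    with tempfile.NamedTemporaryFile('wb', delete=False) as tmp:
        tmp.write('Hello world #{}'.format(i).encode('utf-8'))
    # The file is closed (and flushed) here; only its path is returned.
    return tmp.name


def worker(i, queue):
    # A path is a plain string, so it can be pickled and sent.
    queue.put(write_result(i))


def read_and_remove(path):
    # Read the contents in the parent, then delete the file.
    with open(path, 'rb') as f:
        data = f.read()
    os.remove(path)
    return data


if __name__ == '__main__':
    queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(i, queue))
             for i in range(4)]
    for p in procs:
        p.start()
    # Collect one path per worker before joining the processes.
    paths = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    print(sorted(read_and_remove(path) for path in paths))
```

Passing paths rather than contents may be worthwhile when the results are large, since only a short string crosses the process boundary and the data itself is read once, in the parent.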