Writing to multiple files with multiprocessing. Error: "TypeError: cannot serialize '_io.TextIOWrapper' object"

Date: 2018-09-07 14:50:08

Tags: python python-3.x multiprocessing

I'm trying to write the results of multiprocessing (4 cores/processes) to files. Since the CPU cores work simultaneously, I thought of creating 4 files, 0.txt, 1.txt, 2.txt and 3.txt, and keeping them in a multiprocessing.Manager().list(). But I get the error TypeError: cannot serialize '_io.TextIOWrapper' object.

from multiprocessing import Manager, Process
import os

def run_solver(total, proc_id, result, fouts):
    for i in range(10):
        fouts[proc_id].write('hi\n')

if __name__ == '__main__':
    processes = []
    fouts = Manager().list((open('0.txt', 'w'), open('1.txt', 'w'), open('2.txt', 'w'), open('3.txt', 'w')))
    for proc_id in range(os.cpu_count()):
        processes.append(Process(target=run_solver, args=(int(total/os.cpu_count()), proc_id, result, fouts)))

    for process in processes:
        process.start()

    for process in processes:
        process.join()

    for i in range(len(fouts)):
        fouts[i].close()

I also tried populating the list with file handles from inside the function, like this:

def run_solver(total, proc_id, result, fouts):
    fouts[proc_id] = open(str(proc_id) + '.txt', 'w')
    for i in range(10):
        fouts[proc_id].write('hi\n')
    fouts[proc_id].close()

if __name__ == '__main__':
    processes = []
    fouts = Manager().list([0]*os.cpu_count())

Neither works, and I understand it has something to do with the objects not being serializable or picklable, but I don't know how to fix it. Can someone suggest a solution?

1 answer:

Answer 0: (score: 1)

Open the files in each process. Don't open them in the manager; you can't send open files from the manager process to the executor processes.

from multiprocessing import Manager, Process
import os

def run_solver(total, proc_id, result, fouts):
    with open(fouts[proc_id], 'w') as openfile:
        for i in range(10):
            openfile.write('hi\n')

if __name__ == '__main__':
    processes = []
    with Manager() as manager:
        fouts = manager.list(['0.txt', '1.txt', '2.txt', '3.txt'])
        for proc_id in range(os.cpu_count()):
            processes.append(Process(
                target=run_solver, args=(
                    int(total/os.cpu_count()), proc_id, result, fouts)
            ))
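For completeness, the pattern above can be rounded out into a self-contained runnable sketch. The filenames and the ten-line loop are illustrative assumptions, and the unused `total`/`result` arguments are dropped; a plain list of filenames is enough here, since strings pickle fine and the workers never mutate the shared list.

```python
import os
from multiprocessing import Process

def run_solver(proc_id, fouts):
    # Each worker opens its own file, so only a filename string
    # (not an open file object) ever crosses the process boundary.
    with open(fouts[proc_id], 'w') as openfile:
        for i in range(10):
            openfile.write('hi\n')

if __name__ == '__main__':
    fouts = ['0.txt', '1.txt', '2.txt', '3.txt']
    processes = [Process(target=run_solver, args=(proc_id, fouts))
                 for proc_id in range(len(fouts))]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    # Each of 0.txt .. 3.txt now contains ten 'hi' lines.
```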

If the filenames are shared between processes and you want to prevent race conditions when writing to those files, then you really want a lock per file as well:

def run_solver(total, proc_id, result, fouts, locks):
    with open(fouts[proc_id], 'a') as openfile:
        for i in range(10):
            with locks[proc_id]:
                openfile.write('hi\n')
                openfile.flush()


if __name__ == '__main__':
    processes = []
    with Manager() as manager:
        fouts = manager.list(['0.txt', '1.txt', '2.txt', '3.txt'])
        locks = manager.list([manager.Lock() for fout in fouts])

        for proc_id in range(os.cpu_count()):
            processes.append(Process(
                target=run_solver, args=(
                    int(total/os.cpu_count()), proc_id, result, fouts, locks
                )
            ))

Because the files are opened with with, they are closed automatically each time, and because they are opened in append mode, the different processes don't clobber one another. You do need to remember to flush the write buffer before releasing the lock again.
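A minimal runnable sketch of this shared-file-plus-lock pattern, where both workers append to the same file so the lock is actually exercised. The filename `shared.txt` and the line counts are illustrative assumptions; note that a manager lock (rather than a bare `multiprocessing.Lock`) is used because manager proxies can be passed around freely.

```python
from multiprocessing import Manager, Process

def worker(fname, lock):
    with open(fname, 'a') as f:
        for _ in range(100):
            with lock:           # only one writer at a time
                f.write('hi\n')
                f.flush()        # flush before releasing the lock

if __name__ == '__main__':
    open('shared.txt', 'w').close()  # start with an empty file
    with Manager() as manager:
        lock = manager.Lock()
        processes = [Process(target=worker, args=('shared.txt', lock))
                     for _ in range(2)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
    # shared.txt should now hold 200 intact 'hi' lines, none interleaved.
```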

As an aside, you probably want to look at process pools rather than rolling your own manual pooling.
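A hedged sketch of the same task with `multiprocessing.Pool`, as the answer suggests; the filenames are illustrative assumptions carried over from the question.

```python
from multiprocessing import Pool

def write_file(fname):
    # Each pool worker opens its own file; only the filename string
    # is pickled and sent to the worker, never an open file handle.
    with open(fname, 'w') as f:
        for _ in range(10):
            f.write('hi\n')
    return fname

if __name__ == '__main__':
    names = ['0.txt', '1.txt', '2.txt', '3.txt']
    with Pool(processes=4) as pool:
        done = pool.map(write_file, names)   # blocks until all finish
    print(done)
```

`pool.map` preserves input order, so `done` comes back as `['0.txt', '1.txt', '2.txt', '3.txt']` once every worker has finished.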