Question

Situation: I have a file processor written in Python. Files are "walked" and put into a queue, which is then processed using multiprocessing.

Problem: please see the code below.

fileA.py
==========
import Queue
import os
def walker():
    filelist = Queue.Queue()
    queue_end = Object()
    for root, dirs, files in os.walk('/'):
        for f in files:
            path = os.path.join(root,f)
            if not os.path.islink(path):
                filelist.put(path)
    filelist.put(queue_end)

fileB.py
===========
import fileA
import os
import multiprocessing as mp

def processor(queuelock):
    while True:
        with queuelock:
            filepath = fileA.filelist.get()

            if filepath is fileA.queue_end:
                filelist.put(queue_end)
                break
        #example of a job
        os.move(filepath, "/home/newuser" + filepath)
        print filepath + " has been moved!"

if __name__ == '__main__':
    fileA.walker()
    queuelock = mp.Lock()
    jobs = []
    for i in range(0,mp.cpu_count()):
        process = mp.Process(target=processor(queuelock))
        jobs.append(process)
        process.start()

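The problem is that when the files are moved, every process tries to move the EXACT same file, even though it has already been removed from the queue.

Example output: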

randomFile has been moved!
Error: ......... randomFile not found
Error: ......... randomFile not found
Error: ......... randomFile not found

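This implies that every spawned process dequeues the exact same file and tries to perform the same operation on it.

Question: What am I doing wrong such that, for some reason, the filelist queue is sent to each process (what happens now) instead of being shared by all processes (my expected result)?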

Answer 1

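filelist is currently just a local variable of walker(), and the queue object is not shared with the rest of the code, so at the very least walker() needs to return filelist.
To share the same queue between multiple processes, you need multiprocessing.Queue. A queue.Queue is copied when a process is forked (or spawned), so it becomes a new, independent queue in each process.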
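
A minimal sketch of that fix, assuming Python 3 and a destination directory that lies outside the tree being walked. The names move_tree, src_root, dst_root and the None sentinel are illustrative and not from the question. It also passes the worker as target=processor with args=(...) so that the function runs in the child process rather than being called in the parent, and it uses shutil.move because the os module has no move function.

```python
import multiprocessing as mp
import os
import shutil

# End-of-work marker. None is used instead of object() because an
# object() instance does not keep its identity after being pickled
# through the queue, so an "is" check would never match in a worker.
SENTINEL = None


def walker(src_root, file_queue, n_workers):
    """Put every non-symlink file under src_root on the shared queue."""
    for root, _dirs, files in os.walk(src_root):
        for name in files:
            path = os.path.join(root, name)
            if not os.path.islink(path):
                file_queue.put(path)
    # One sentinel per worker, so no worker has to put it back.
    for _ in range(n_workers):
        file_queue.put(SENTINEL)


def processor(file_queue, src_root, dst_root):
    """Move files taken from the shared queue until a sentinel arrives."""
    while True:
        path = file_queue.get()  # multiprocessing.Queue is process-safe
        if path is SENTINEL:
            break
        # Mirror the source layout under dst_root (an assumed convention).
        target = os.path.join(dst_root, os.path.relpath(path, src_root))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.move(path, target)
        print(path + " has been moved!")


def move_tree(src_root, dst_root, n_workers=None):
    """Start the workers, fill the queue from the parent, then wait."""
    n_workers = n_workers or mp.cpu_count()
    file_queue = mp.Queue()  # created once and handed to every worker
    jobs = [
        mp.Process(target=processor, args=(file_queue, src_root, dst_root))
        for _ in range(n_workers)
    ]
    for job in jobs:
        job.start()
    walker(src_root, file_queue, n_workers)
    for job in jobs:
        job.join()


if __name__ == "__main__":
    import sys

    # Hypothetical command line: python mover.py SRC_DIR DST_DIR
    if len(sys.argv) == 3:
        move_tree(sys.argv[1], sys.argv[2])
```

Because each path is taken off the one shared queue exactly once, no two workers should receive the same file, and the explicit lock around get() from the question is no longer needed.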
