Question

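I am trying to use multiprocessing to return a list, but instead of waiting until all processes are done, I get several returns from the single return statement in mp_factorizer, like so: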

None
None
(returns list)

In this example I used 2 threads. If I used 5 threads, there would be 5 None returns before the list is put out. Here is the code:

def mp_factorizer(nums, nprocs, objecttouse):
    if __name__ == '__main__':
        out_q = multiprocessing.Queue()
        chunksize = int(math.ceil(len(nums) / float(nprocs)))
        procs = []
        for i in range(nprocs):
            p = multiprocessing.Process(
                    target=worker,                   
                    args=(nums[chunksize * i:chunksize * (i + 1)],
                          out_q,
                    objecttouse))
            procs.append(p)
            p.start()

        # Collect all results into a single result dict. We know how many dicts
        # with results to expect.
        resultlist = []
        for i in range(nprocs):
            temp=out_q.get()
            index =0
            for i in temp:
                resultlist.append(temp[index][0][0:])
                index +=1

        # Wait for all worker processes to finish
        for p in procs:
            p.join()
            resultlist2 = [x for x in resultlist if x != []]
        return resultlist2

def worker(nums, out_q, objecttouse):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outlist = []
    for n in nums:        
        outputlist=objecttouse.getevents(n)
        if outputlist:
            outlist.append(outputlist)   
    out_q.put(outlist)

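mp_factorizer gets a list of items, the number of threads, and an object that the workers should use. It then splits the list of items so that all threads get an equal share of the list, and starts the workers. The workers then use the object to calculate something from the given list and add the result to the queue. mp_factorizer is supposed to collect all results from the queue, merge them into one large list and return that list. However, I get multiple returns.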
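The splitting step described above can be sketched on its own. This is only an illustration of the slicing already used in mp_factorizer; the helper name split_into_chunks is made up here and is not part of the original code.

```python
import math


def split_into_chunks(nums, nprocs):
    """Slice nums into nprocs consecutive chunks, as mp_factorizer does."""
    # Round up so that every item lands in some chunk.
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    return [nums[chunksize * i:chunksize * (i + 1)] for i in range(nprocs)]


chunks = split_into_chunks(list(range(7)), 3)
print(chunks)
```

Note that the shares are only roughly equal: because the chunk size is rounded up, the last chunk can be shorter than the others, or even empty (4 items over 3 workers gives two chunks of 2 and one empty chunk).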

What am I doing wrong? Or is this expected behavior due to the strange way Windows handles multiprocessing? (Python 2.7.3, Windows 7 64-bit)

EDIT: The problem was the wrong placement of if __name__ == '__main__':. I found this out while working on another problem; see using multiprocessing in a sub process for a complete explanation.

Answer 1

The if __name__ == '__main__' is in the wrong place. A quick fix would be to protect only the call to mp_factorizer, as Jan_ Karila suggested:

if __name__ == '__main__':
    print mp_factorizer(list, 2, someobject)

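However, on Windows the main file is executed once when you run it, plus once for every worker thread, in this case 2. So that is a total of 3 executions of the main thread, not counting the protected part of the code.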

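This can cause problems as soon as other computations are made in the same main thread, and at the very least it slows things down unnecessarily. Even though only the worker function should be executed several times, on Windows everything that is not protected by if __name__ == '__main__' gets executed.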
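One rough way to see this without starting any processes is to run the same source twice under different module names, which approximates what happens when a child process re-imports the main file. The sketch below uses runpy from the standard library. The script text and the run_as helper are invented for illustration, and "__mp_main__" is the name Python 3 gives the re-imported main module (Python 2.7 uses a different internal name), so treat it as an assumption rather than something taken from the question.

```python
import os
import runpy
import tempfile

# Hypothetical main file: one unprotected statement, one protected statement.
SCRIPT = '''
executed = ["top-level code"]        # runs every time the file is loaded
if __name__ == "__main__":
    executed.append("guarded code")  # runs only in the real main process
'''


def run_as(run_name):
    """Execute SCRIPT under the given module name and report what ran."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(SCRIPT)
        # run_path returns the globals of the executed file.
        return runpy.run_path(path, run_name=run_name)["executed"]
    finally:
        os.remove(path)


parent_run = run_as("__main__")     # how the file runs when you start it
child_run = run_as("__mp_main__")   # roughly how a child process loads it
print(parent_run)
print(child_run)
```

The unprotected statement shows up in both runs, while the protected one appears only in the run named __main__.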

So the solution is to protect the whole main process by executing all of its code only after if __name__ == '__main__'.

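If the worker function is in the same file, however, it needs to be excluded from this if statement, because otherwise it cannot be called several times for multiprocessing.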

Pseudocode main thread:

# Import stuff
if __name__ == '__main__':
    # Execute whatever you want; it will only be executed
    # as often as you intend it to.
    # Execute the function that starts multiprocessing,
    # in this case mp_factorizer().
    # There is no worker function code here; it's in another file.
    pass

Even though the whole main process is protected, the worker function can still be started, as long as it is in another file.

Pseudocode main thread, with worker function:

# Import stuff
# If the worker code is in the main thread, exclude it from the if statement:
def worker():
    # worker code
    pass

if __name__ == '__main__':
    # Execute whatever you want; it will only be executed
    # as often as you intend it to.
    # Execute the function that starts multiprocessing,
    # in this case mp_factorizer().
    pass
# All code outside of the if statement will be executed multiple times,
# depending on the number of assigned worker threads.

For a longer explanation with runnable code, see using multiprocessing in a sub process
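Putting the pieces of this answer together, a single-file layout might look like the sketch below. It is not the asker's real program: EvenFinder is a made-up stand-in for the object with a getevents method, the result merging is simplified to a plain extend, and the input list is arbitrary. print is called with a single argument so the code behaves the same on Python 2.7 and Python 3.

```python
import math
import multiprocessing


class EvenFinder(object):
    """Hypothetical stand-in for the asker's object; only getevents() matters."""

    def getevents(self, n):
        # Non-empty list for even numbers, empty list otherwise.
        return [n] if n % 2 == 0 else []


def worker(nums, out_q, objecttouse):
    """Runs in a child process: collect non-empty results and queue them."""
    outlist = []
    for n in nums:
        outputlist = objecttouse.getevents(n)
        if outputlist:
            outlist.append(outputlist)
    out_q.put(outlist)


def mp_factorizer(nums, nprocs, objecttouse):
    """No __name__ check in here, so the function always returns a list."""
    out_q = multiprocessing.Queue()
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    procs = []
    for i in range(nprocs):
        chunk = nums[chunksize * i:chunksize * (i + 1)]
        p = multiprocessing.Process(target=worker,
                                    args=(chunk, out_q, objecttouse))
        procs.append(p)
        p.start()

    # Drain the queue before joining; each worker puts exactly one list.
    resultlist = []
    for _ in range(nprocs):
        resultlist.extend(out_q.get())

    for p in procs:
        p.join()
    return resultlist


if __name__ == '__main__':
    # Only the parent process reaches this line, so the result prints once.
    print(sorted(mp_factorizer(list(range(10)), 2, EvenFinder())))
```

The worker and the class live at module level so that child processes can find them when they load the file, while the only code that actually starts anything sits under the guard. The results are sorted before printing because the order in which workers finish is not guaranteed.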

Answer 2

Your if __name__ == '__main__' statement is in the wrong place. Put it around the print statement to prevent the subprocesses from executing that line:

if __name__ == '__main__':
    print mp_factorizer(list, 2, someobject)

Right now the if is inside mp_factorizer, which makes the function return None when it is called inside a subprocess.
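The None values can be reproduced without any subprocess. In this small sketch a function with the check inside its body is defined in a namespace whose __name__ is set by hand; guarded_inside and call_as are illustrative names, and '__mp_main__' is simply an example of a module name other than '__main__'.

```python
SOURCE = '''
def guarded_inside():
    if __name__ == '__main__':
        return ['result']
    # No return on this path, so Python returns None.
'''


def call_as(module_name):
    """Define guarded_inside() in a namespace with this __name__ and call it."""
    namespace = {'__name__': module_name}
    exec(SOURCE, namespace)
    return namespace['guarded_inside']()


print(call_as('__main__'))      # the parent process gets the list
print(call_as('__mp_main__'))   # any other module name falls through to None
```

With the check moved out of the function and around the call instead, the function body has a single path and the fall-through disappears.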

Multiple outputs returned from python multiprocessing function
