Best way to do multiprocessing in Python on Windows

Asked: 2014-08-06 12:13:07

Tags: python windows multiprocessing

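I have to write a program that reads a PDF, converts each page to a PNG file, and then runs some code on each page image in parallel. I'm looking for a way to create a parent process with a constant number of child processes.

The parent process must hand out the work to the children. If the parent has 21 pages to process and only 5 children, it has to manage a queue to dispatch the work without killing the 5 children and creating new ones. When a child finishes its job, it sends a message to the parent asking for new work.

I don't want to kill the children, because I think reusing them is faster than killing them and creating new subprocesses or children. Am I going in the wrong direction?

I'm trying to do this with multiprocessing.apply_async, but I can't find a way to do what I need.

Any suggestions or tutorials?

Sorry for my bad English.

Some code showing what I'm trying to do: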

from multiprocessing import Pool
import time
import random

def BarcodeSearcher(x):
    # Here goes the image processing
    return x*x

def resultCollector(result):
    # Runs in the parent process each time a task finishes
    print(result)

def main():
    pool = Pool(processes=3)
    for pag in range(3):
        pool.apply_async(BarcodeSearcher, args = (pag, ), callback = resultCollector) 
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()

1 Answer:

Answer 0 (score: 1)

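The way you're currently doing it should work fine. multiprocessing.Pool creates a pool with a constant number of worker processes, all of which stay alive for the lifetime of the Pool. The Pool has an internal queue that it uses to send a work item to a worker process as soon as that worker finishes its previous one. So all you need to do is submit all the work you want done to the Pool, and the Pool will take care of distributing it among your three worker processes.

Consider your example, except that now we feed it 30 work items: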

from multiprocessing import Pool, current_process
import time
import random

def BarcodeSearcher(x):
    print("Process %s: handling %s" % (current_process(), x))
    #Here goes the image processing    
    return x*x 

def resultCollector(result):
    print(result)

def main():
    pool = Pool(processes=3)
    for pag in range(30):
        pool.apply_async(BarcodeSearcher, args = (pag, ), callback = resultCollector) 
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()

Here is the output:

Process <Process(PoolWorker-1, started daemon)>: handling 0
Process <Process(PoolWorker-3, started daemon)>: handling 2
Process <Process(PoolWorker-1, started daemon)>: handling 3
Process <Process(PoolWorker-1, started daemon)>: handling 4
Process <Process(PoolWorker-3, started daemon)>: handling 5
0
Process <Process(PoolWorker-2, started daemon)>: handling 1
9
4
Process <Process(PoolWorker-1, started daemon)>: handling 6
Process <Process(PoolWorker-1, started daemon)>: handling 7
16
Process <Process(PoolWorker-2, started daemon)>: handling 8
25
1
Process <Process(PoolWorker-3, started daemon)>: handling 9
36
49
64
81
Process <Process(PoolWorker-1, started daemon)>: handling 10
100
Process <Process(PoolWorker-2, started daemon)>: handling 11
Process <Process(PoolWorker-3, started daemon)>: handling 12
121
144
Process <Process(PoolWorker-2, started daemon)>: handling 13
Process <Process(PoolWorker-1, started daemon)>: handling 14
169
Process <Process(PoolWorker-2, started daemon)>: handling 15
196
Process <Process(PoolWorker-1, started daemon)>: handling 16
225
Process <Process(PoolWorker-3, started daemon)>: handling 17
256
Process <Process(PoolWorker-3, started daemon)>: handling 18
Process <Process(PoolWorker-1, started daemon)>: handling 19
Process <Process(PoolWorker-1, started daemon)>: handling 20
289
Process <Process(PoolWorker-1, started daemon)>: handling 21
324
Process <Process(PoolWorker-3, started daemon)>: handling 22
361
Process <Process(PoolWorker-3, started daemon)>: handling 24
400
Process <Process(PoolWorker-1, started daemon)>: handling 25
441
Process <Process(PoolWorker-3, started daemon)>: handling 26
Process <Process(PoolWorker-1, started daemon)>: handling 27
484
576
Process <Process(PoolWorker-3, started daemon)>: handling 28
Process <Process(PoolWorker-1, started daemon)>: handling 29
625
676
729
784
841
Process <Process(PoolWorker-2, started daemon)>: handling 23
529

As you can see, the work is distributed among your workers without you having to do anything special.
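If you want to confirm that the same worker processes are reused rather than recreated, one option is to have each task report the PID of the process that ran it. The following is only a minimal sketch under that assumption: process_page and run_pages are hypothetical names, and the squaring stands in for the real image processing. It uses the 21 pages and 5 workers from the question.

```python
import os
from multiprocessing import Pool


def process_page(page):
    # Placeholder for the real image processing (hypothetical).
    # The worker's PID is returned so the parent can see who did the work.
    return page, page * page, os.getpid()


def run_pages(n_pages, n_workers):
    """Run n_pages tasks on a pool of n_workers processes.

    Returns the squares in page order and the set of worker PIDs seen.
    """
    pool = Pool(processes=n_workers)
    try:
        # Submit every page up front; the Pool queues them internally.
        pending = [pool.apply_async(process_page, (page,))
                   for page in range(n_pages)]
        # get() blocks until the corresponding task has finished.
        outcomes = [p.get() for p in pending]
    finally:
        pool.close()
        pool.join()
    squares = [square for _, square, _ in outcomes]
    pids = set(pid for _, _, pid in outcomes)
    return squares, pids


if __name__ == '__main__':
    # The guard is required on Windows, where child processes
    # re-import this module instead of forking.
    squares, pids = run_pages(21, 5)
    print(squares)
    print("%d pages were handled by %d worker process(es)"
          % (len(squares), len(pids)))
```

The number of distinct PIDs should never exceed the pool size, however many pages are submitted, since the Pool does not replace its workers unless you ask it to (for example with maxtasksperchild).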