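I need to write a program that reads a PDF, converts each page to a PNG file, and then runs some code on each page image in parallel. I am looking for a way to create a parent process with a constant number of child processes.
The parent process must hand out work to the children. If the parent has 21 pages to process and only 5 children, it must manage a queue to dispatch the work without killing the 5 children and creating new ones. When a child finishes its job, it sends a message to the parent asking for new work.
I don't want to kill the children, because I think reusing them is faster than killing them and creating new child processes. Am I heading in the wrong direction?
I am trying to do this with apply_async from multiprocessing, but I can't find a way to do what I need.
Any suggestions or tutorials?
Sorry for my bad English.
Here is some of the code I am trying: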
from multiprocessing import Pool
import time
import random

def BarcodeSearcher(x):
    # Here goes the image processing
    return x*x

def resultCollector(result):
    print(result)

def main():
    pool = Pool(processes=3)
    for pag in range(3):
        pool.apply_async(BarcodeSearcher, args=(pag,), callback=resultCollector)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()
Answer 0 (score: 1)
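The way you are doing it now should work fine. multiprocessing.Pool creates a pool with a constant number of worker processes, all of which stay alive for the lifetime of the Pool. The Pool has an internal queue that it uses to send a work item to a worker process as soon as that worker finishes its previous one. So all you need to do is submit all the work you want done to the Pool, and the Pool will take care of distributing it across your three worker processes.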
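As a minimal sketch of that idea with the numbers from the question (21 pages, 5 workers), the following hands the whole batch to the pool in one call. The names process_page and pages_per_worker are hypothetical, and the squaring is only a stand-in for the real image processing. Each result carries the worker's pid, so you can check that no more than five processes ever do the work.

```python
import os
from collections import Counter
from multiprocessing import Pool


def process_page(page):
    # Placeholder for the real image processing (assumption: it only needs
    # the page number). Returns the worker's pid alongside the result.
    return os.getpid(), page, page * page


def pages_per_worker(outcomes):
    # Count how many pages each worker process (keyed by pid) handled.
    return Counter(pid for pid, _page, _result in outcomes)


def main():
    pool = Pool(processes=5)  # five workers, created once and reused
    try:
        # The pool's internal queue feeds the 21 pages to whichever
        # worker is free; no worker is killed or recreated in between.
        outcomes = pool.map(process_page, range(21), chunksize=1)
    finally:
        pool.close()
        pool.join()
    counts = pages_per_worker(outcomes)
    print("%d pages handled by %d worker processes" % (len(outcomes), len(counts)))
    for pid, n in sorted(counts.items()):
        print("worker pid %d handled %d pages" % (pid, n))


if __name__ == '__main__':
    main()
```

How the pages are split between the workers varies from run to run, but the number of distinct pids should never exceed the pool size.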
Consider your example, except that now we feed it 30 work items:
from multiprocessing import Pool, current_process
import time
import random

def BarcodeSearcher(x):
    # Show which worker process picked up this item
    print("Process %s: handling %s" % (current_process(), x))
    # Here goes the image processing
    return x*x

def resultCollector(result):
    print(result)

def main():
    pool = Pool(processes=3)
    for pag in range(30):
        pool.apply_async(BarcodeSearcher, args=(pag,), callback=resultCollector)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()
Here is the output:
Process <Process(PoolWorker-1, started daemon)>: handling 0
Process <Process(PoolWorker-3, started daemon)>: handling 2
Process <Process(PoolWorker-1, started daemon)>: handling 3
Process <Process(PoolWorker-1, started daemon)>: handling 4
Process <Process(PoolWorker-3, started daemon)>: handling 5
0
Process <Process(PoolWorker-2, started daemon)>: handling 1
9
4
Process <Process(PoolWorker-1, started daemon)>: handling 6
Process <Process(PoolWorker-1, started daemon)>: handling 7
16
Process <Process(PoolWorker-2, started daemon)>: handling 8
25
1
Process <Process(PoolWorker-3, started daemon)>: handling 9
36
49
64
81
Process <Process(PoolWorker-1, started daemon)>: handling 10
100
Process <Process(PoolWorker-2, started daemon)>: handling 11
Process <Process(PoolWorker-3, started daemon)>: handling 12
121
144
Process <Process(PoolWorker-2, started daemon)>: handling 13
Process <Process(PoolWorker-1, started daemon)>: handling 14
169
Process <Process(PoolWorker-2, started daemon)>: handling 15
196
Process <Process(PoolWorker-1, started daemon)>: handling 16
225
Process <Process(PoolWorker-3, started daemon)>: handling 17
256
Process <Process(PoolWorker-3, started daemon)>: handling 18
Process <Process(PoolWorker-1, started daemon)>: handling 19
Process <Process(PoolWorker-1, started daemon)>: handling 20
289
Process <Process(PoolWorker-1, started daemon)>: handling 21
324
Process <Process(PoolWorker-3, started daemon)>: handling 22
361
Process <Process(PoolWorker-3, started daemon)>: handling 24
400
Process <Process(PoolWorker-1, started daemon)>: handling 25
441
Process <Process(PoolWorker-3, started daemon)>: handling 26
Process <Process(PoolWorker-1, started daemon)>: handling 27
484
576
Process <Process(PoolWorker-3, started daemon)>: handling 28
Process <Process(PoolWorker-1, started daemon)>: handling 29
625
676
729
784
841
Process <Process(PoolWorker-2, started daemon)>: handling 23
529
As you can see, the work is distributed among your workers without you having to do anything special.
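The output also shows that the callback fires in completion order, not submission order (0, 9, 4, 16, 25, 1, ...). If the collector needs results in page order, one option is to keep the AsyncResult objects that apply_async returns and call get() on them in sequence; get() also re-raises any exception thrown in the worker. A minimal sketch, with barcode_searcher and collect_in_order as illustrative names and the squaring again standing in for the image processing:

```python
from multiprocessing import Pool


def barcode_searcher(page):
    # Placeholder for the real image processing.
    return page * page


def collect_in_order(handles):
    # Each handle only needs a get() method. Results come back in the
    # order the handles were created, regardless of completion order.
    return [handle.get() for handle in handles]


def main():
    pool = Pool(processes=3)
    try:
        # Submit everything first, keeping one handle per page.
        handles = [pool.apply_async(barcode_searcher, args=(page,))
                   for page in range(30)]
        results = collect_in_order(handles)
    finally:
        pool.close()
        pool.join()
    print(results)


if __name__ == '__main__':
    main()
```

When per-task handles are not needed, pool.map(barcode_searcher, range(30)) returns the same ordered list in a single call.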