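I have developed a machine-learning-based object detector that scans an image at several resolutions and extracts features from a "window" that slides across the image. This is time-consuming, but the rescaled versions of the image are independent of one another, so the feature-extraction step can be done in parallel. The number of sub-images depends on the dimensions of the original image, so it is not known in advance. The plan is to use Python's multiprocessing package.
Following the advice at https://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming and in the question "Multiprocessing a for loop?", I decided to use the multiprocessing.Queue() approach.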
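To make the plan concrete, here is a minimal sketch of how the per-resolution work could be handed to worker processes when the number of sub-images is not known in advance. It is not my detector: `rescaled_versions`, `extract_features`, and `extract_all` are hypothetical names, striding by 2 is only a crude stand-in for real rescaling, and the sketch uses multiprocessing.Pool instead of Queue purely for illustration.

```python
from multiprocessing import Pool

import numpy as np


def rescaled_versions(image, min_side=8):
    """Yield progressively smaller versions of `image` (hypothetical helper).

    Striding by 2 is a crude stand-in for real rescaling; how many versions
    come out depends on the image size, so the count is not fixed up front.
    """
    while min(image.shape) >= min_side:
        yield image
        image = image[::2, ::2]


def extract_features(image):
    """Placeholder for the expensive sliding-window feature extraction."""
    return image.shape, float(image.mean())


def extract_all(image, workers=2):
    # Pool.map consumes the generator, sends each version to a worker
    # process, and returns the results in the original order.
    with Pool(processes=workers) as pool:
        return pool.map(extract_features, rescaled_versions(image))


if __name__ == "__main__":
    img = np.ones((64, 48))
    for shape, mean in extract_all(img):
        print(shape, mean)
```

The pool decides how tasks are spread over its workers, so the code does not need to know the number of versions beforehand. The `if __name__ == "__main__":` guard matters on Windows, where child processes re-import the main module.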
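Unfortunately, I could not convert my code successfully, so I created a simple case to work through the problem. It runs without errors and gives the expected result, but I would like to know whether there is a better, faster, or more Pythonic way to do this. Also, I expected the processes to run in parallel, but I am not sure that they do. The code prints the parent process ID and the process ID of each process; the parent IDs are all the same (5860), while the process IDs differ. How should I interpret this? Are the processes distributed across my CPUs, or do they behave like threads on a single CPU? I watched my 4 CPUs in Windows Task Manager and saw the utilization rise and fall at the same time on all of them. The change in magnitude was not pronounced, and it was not much different from the fluctuations that occur when the code is not running (for example, while I am typing this message).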
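One way I could check this more directly than with Task Manager is sketched below. Each worker reports its parent PID, its own PID, and the wall-clock interval in which it ran; overlapping intervals would indicate that the processes were alive at the same time. Every Process started from one script is a child of that script's process, so an identical parent ID with distinct process IDs is what separate OS processes (not threads) look like. The names `worker` and `intervals_overlap` are hypothetical, and `time.sleep` is only a stand-in for real work, so this shows concurrency in time rather than which CPU core was used.

```python
import os
import time
from multiprocessing import Process, Queue


def worker(q, label, seconds):
    """Report which process ran this task and when (hypothetical helper)."""
    start = time.time()
    time.sleep(seconds)  # stand-in for real work
    q.put((label, os.getppid(), os.getpid(), start, time.time()))


def intervals_overlap(a, b):
    """True if the (start, end) intervals a and b share any time."""
    return a[0] < b[1] and b[0] < a[1]


if __name__ == "__main__":
    q = Queue()
    procs = [Process(target=worker, args=(q, i, 0.5)) for i in range(4)]
    for p in procs:
        p.start()  # start everything before waiting on anything
    reports = sorted(q.get() for _ in procs)  # sort by label
    for p in procs:
        p.join()
    for label, ppid, pid, start, end in reports:
        print(label, "parent:", ppid, "pid:", pid)
    spans = [(r[3], r[4]) for r in reports]
    print("first two overlap:", intervals_overlap(spans[0], spans[1]))
```

If `start()` and `join()` were called back to back for each process, I would expect the intervals not to overlap, because each process would have to finish before the next one is started.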
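My code and its output are below. Any feedback on how to improve the code and/or clarify what is actually happening would be greatly appreciated.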
from multiprocessing import Process,Queue
import os
import numpy as np
np.set_printoptions(precision=2) # Used when printing matrix
class myob:
    def __init__(self,val,N):
        self.val= val
        self.N=N

    def printprocessinfo(self):
        print("val = {}".format(self.val))
        print('parent process:', os.getppid())
        print('process id:', os.getpid())

    def somelongprocess_mp(self):
        # Take the inverse of a random matrix of size N
        A=np.random.random((self.N, self.N))
        np.fill_diagonal(A,100) # dominant diags will ensure invert.
        return np.linalg.inv(A) * A # Should return identity matrix

def f(q,val,N):
    a=myob(val,N)
    a.printprocessinfo()
    q.put(a.somelongprocess_mp())

def run_multiple_processes_using_lists():
    numProcesses =5 # Number of processes
    q=[] # Queue List
    p=[] # Process List
    d=[] # Output List
    val_list=list(range(0,11)) # Values
    xlist=[5,30,30,30,25,30,20,20,25,30,30,20] # Matrix Sizes (Hangs on 32+)
    for i in range(0,numProcesses-1):
        q.append(Queue())
        p.append(Process(target=f,args=(q[i],val_list[i],xlist[i])))
    print("*** Queue List ***")
    print(q)
    print("*** Process List ***")
    print(p)
    print("Start All Processes ...")
    for j in range(0,numProcesses-1):
        pp=p[j]
        pp.start()
        pp.join()
    print("Collecting Results")
    for k in range(0,numProcesses-1):
        d.append(q[k].get())
    print("Verify Output")
    print(d[0]) # Print the first inverted matrices from the 1st process
    print("Matrix Shapes")
    for l in range(0,numProcesses-1):
        print(d[l].shape)

if __name__ == '__main__':
    run_multiple_processes_using_lists()
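For comparison, here is a minimal sketch of how the loop above could be reordered: start every process first, read each result with get(), and only then call join(). The multiprocessing programming guidelines linked above warn that a process which has put items on a queue waits until the buffered data has been flushed to the underlying pipe before it terminates, so joining before the queue is drained can deadlock. That may be why larger matrices hang, though I have not confirmed it. The sketch also uses `@` (matrix product) instead of `*` (element-wise product), since only the matrix product of inv(A) and A is the identity. `invert_check` and `run_parallel` are hypothetical names, and a single shared queue replaces the list of queues as an assumption of this sketch.

```python
from multiprocessing import Process, Queue

import numpy as np


def invert_check(N):
    """Return inv(A) @ A for a random diagonally dominant N x N matrix."""
    A = np.random.random((N, N))
    np.fill_diagonal(A, 100)
    return np.linalg.inv(A) @ A  # matrix product, close to the identity


def f(q, idx, N):
    q.put((idx, invert_check(N)))  # tag the result with its task index


def run_parallel(sizes):
    q = Queue()  # one shared queue instead of one queue per process
    procs = [Process(target=f, args=(q, i, N)) for i, N in enumerate(sizes)]
    for p in procs:
        p.start()  # 1) start all, so they can run concurrently
    results = dict(q.get() for _ in procs)  # 2) drain the queue
    for p in procs:
        p.join()  # 3) join only after every result has been read
    return [results[i] for i in range(len(sizes))]


if __name__ == "__main__":
    out = run_parallel([5, 30, 40, 64])
    print([m.shape for m in out])
```

Results arrive in completion order, which is why each one is tagged with its index and reordered at the end. If a worker raised an exception before calling put(), the matching get() would block, so real code would need a timeout or error handling.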
Here is the output:
(base) C:\Python Scripts>python Example.py
*** Queue List ***
[<multiprocessing.queues.Queue object at 0x0000020CEC1B4390>, <multiprocessing.queues.Queue object at 0x0000020CECE9B080>, <multiprocessing.queues.Queue object at 0x0000020CECE9B208>, <multiprocessing.queues.Queue object at 0x0000020CECE9B390>]
*** Process List ***
[<Process(Process-1, initial)>, <Process(Process-2, initial)>, <Process(Process-3, initial)>, <Process(Process-4, initial)>]
Start All Processes ...
val = 0
parent process: 5860
process id: 1912
val = 1
parent process: 5860
process id: 11068
val = 2
parent process: 5860
process id: 8248
val = 3
parent process: 5860
process id: 7176
Collecting Results
Verify Output
[[ 1.00e+00 -6.09e-06 -5.43e-08 -6.22e-05 -1.31e-05]
[-9.43e-06 1.00e+00 -3.60e-05 -1.59e-05 -2.85e-05]
[-4.36e-06 -8.55e-05 1.00e+00 -8.99e-06 -8.67e-05]
[-1.36e-05 -1.48e-05 -1.83e-06 1.00e+00 -9.68e-05]
[-7.75e-05 -8.54e-05 -4.44e-05 -3.25e-05 1.00e+00]]
Matrix Shapes
(5, 5)
(30, 30)
(30, 30)
(30, 30)