I want to run 15 commands, but only 3 at a time.
testme.py
import multiprocessing
import time
import random
import subprocess

def popen_wrapper(i):
    p = subprocess.Popen(['echo', 'hi'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = p.communicate()
    print(stdout)
    time.sleep(random.randint(5, 20))  # pretend it's doing some work
    return p.returncode
num_to_run = 15
max_parallel = 3
running = []

for i in range(num_to_run):
    p = multiprocessing.Process(target=popen_wrapper, args=(i,))
    running.append(p)
    p.start()
    if len(running) >= max_parallel:
        # blocking wait - join on whoever finishes first, then continue
        pass
    else:
        # nonblocking wait - see if any process is finished; if so, join the finished processes
        pass
I'm not sure how to implement these two comments:
if len(running) >= max_parallel:
    # blocking wait - join on whoever finishes first, then continue
else:
    # nonblocking wait - see if any process is finished; if so, join the finished processes
I can't just do something like this:
for p in running:
    p.join()
because the second process may already have finished while I'm still blocked waiting on the first one.
Question: how do you check whether the processes in running have finished, both blocking and non-blocking (finding whichever one finishes first)?
I'm looking for something like waitpid, maybe.
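(A minimal sketch of one way those two commented branches could be filled in; this is an illustration, not code from the original post. It polls the running list with Process.is_alive() for the non-blocking case and loops with a short sleep for the blocking case.)

import time

def reap_finished(running):
    # Non-blocking: join and drop any processes that have already exited.
    still_running = []
    for p in running:
        if p.is_alive():
            still_running.append(p)
        else:
            p.join()  # already finished, so join() returns immediately
    return still_running

def wait_for_one(running, poll_interval=0.1):
    # Blocking: wait until at least one process finishes, then reap it.
    while True:
        remaining = reap_finished(running)
        if len(remaining) < len(running):
            return remaining
        time.sleep(poll_interval)  # avoid busy-waiting

The main loop could then call running = wait_for_one(running) when the list is full and running = reap_finished(running) otherwise. On Python 3.3+, multiprocessing.connection.wait([p.sentinel for p in running]) is the closest analogue to waitpid.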
Answer 0 (score: 4):
Perhaps the simplest way is to use multiprocessing.Pool:
pool = mp.Pool(3)
sets up a pool with 3 worker processes. You can then submit the 15 tasks to the pool:
for i in range(num_to_run):
    pool.apply_async(popen_wrapper, args=(i,), callback=log_result)
All the machinery needed to coordinate the 3 workers and the 15 tasks is handled by mp.Pool.
Using mp.Pool:
import multiprocessing as mp
import time
import random
import subprocess
import logging

logger = mp.log_to_stderr(logging.WARN)

def popen_wrapper(i):
    logger.warn('echo "hi"')
    return i

def log_result(retval):
    results.append(retval)

if __name__ == '__main__':
    num_to_run = 15
    max_parallel = 3
    results = []
    pool = mp.Pool(max_parallel)
    for i in range(num_to_run):
        pool.apply_async(popen_wrapper, args=(i,), callback=log_result)
    pool.close()
    pool.join()
    logger.warn(results)
yields
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-2] echo "hi"
[WARNING/MainProcess] [0, 2, 3, 5, 4, 6, 7, 8, 9, 10, 11, 12, 14, 13, 1]
The logging statements show which PoolWorker handles each task, and the last logging statement shows that the MainProcess has received the return values from all 15 popen_wrapper calls.
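(Note that the final results list — [0, 2, 3, 5, 4, ...] — is not in submission order, because each callback fires whenever its task happens to finish. If submission order matters, one option, sketched here and not part of the original answer, is to keep the AsyncResult objects returned by apply_async and call .get() on them in order:)

# Sketch only: collect results in submission order, assuming the same
# popen_wrapper, pool and num_to_run as in the example above.
async_results = [pool.apply_async(popen_wrapper, args=(i,))
                 for i in range(num_to_run)]
pool.close()
pool.join()
ordered = [r.get() for r in async_results]  # r.get() returns popen_wrapper(i)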
If you don't want to use a pool, you can set up one mp.Queue for the tasks and another mp.Queue for the return values:

Using mp.Process and mp.Queues:
import multiprocessing as mp
import time
import random
import subprocess
import logging

logger = mp.log_to_stderr(logging.WARN)
SENTINEL = None

def popen_wrapper(inqueue, outqueue):
    for i in iter(inqueue.get, SENTINEL):
        logger.warn('echo "hi"')
        outqueue.put(i)

if __name__ == '__main__':
    num_to_run = 15
    max_parallel = 3
    inqueue = mp.Queue()
    outqueue = mp.Queue()
    procs = [mp.Process(target=popen_wrapper, args=(inqueue, outqueue))
             for i in range(max_parallel)]
    for p in procs:
        p.start()
    for i in range(num_to_run):
        inqueue.put(i)
    for i in range(max_parallel):
        # Put sentinels in the queue to tell `popen_wrapper` to quit
        inqueue.put(SENTINEL)
    for p in procs:
        p.join()
    results = [outqueue.get() for i in range(num_to_run)]
    logger.warn(results)
Note that because you create

procs = [mp.Process(target=popen_wrapper, args=(inqueue, outqueue))
         for i in range(max_parallel)]

you are limited to max_parallel (e.g. 3) worker processes. You then send all 15 tasks to a single queue:
for i in range(num_to_run):
    inqueue.put(i)
and let the workers pull tasks off the queue:
def popen_wrapper(inqueue, outqueue):
    for i in iter(inqueue.get, SENTINEL):
        logger.warn('echo "hi"')
        outqueue.put(i)
You might also be interested in Doug Hellman's multiprocessing tutorial. Among its many useful examples, you'll find an ActivePool recipe that shows how to spawn 10 processes and throttle them (using an mp.Semaphore) so that only 3 are active at any given time. While that may be instructive, it's probably not the best solution in your case, since there doesn't seem to be any reason for you to spawn more than 3 processes.
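(For reference, a minimal sketch of that semaphore idea — my own illustration, not the ActivePool recipe itself: each worker acquires an mp.Semaphore(3) before doing its work, so even though 10 processes are spawned, at most 3 run their payload at once.)

import multiprocessing as mp
import time

def worker(i, sema):
    # At most 3 workers can hold the semaphore at a time; the rest block here.
    with sema:
        print('worker %d running' % i)
        time.sleep(1)  # stand-in for the real command

if __name__ == '__main__':
    sema = mp.Semaphore(3)
    procs = [mp.Process(target=worker, args=(i, sema)) for i in range(10)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

As the answer notes, this mostly matters when you genuinely want to spawn more processes than you allow to run concurrently.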