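I'd like to check with you whether my approach to running exe files in parallel with Python's multiprocessing is fundamentally flawed.
I have a large number of jobs (100000 in the sample code) and I want to use all available cores (16 on my machine) to run them in parallel. The code below doesn't use a Queue the way many examples I've seen do, but it seems to work. I just want to avoid a situation where the code "works" but hides a serious bug that is waiting to blow up once I scale it to multiple compute nodes. Can anyone help?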
import subprocess
import multiprocessing


def task_fn(task_dir):
    # Run the executable inside the task's own directory
    cmd_str = ["my_exe", "-my_exe_arguments"]
    try:
        msg = subprocess.check_output(cmd_str, cwd=task_dir, stderr=subprocess.STDOUT, universal_newlines=True)
    except subprocess.CalledProcessError as e:
        with open("a_unique_err_log_file.log", "w") as f:
            f.write(e.output)
    return


if __name__ == "__main__":
    n_cpu = multiprocessing.cpu_count()
    num_jobs = 100000
    # Placeholder processes that were never started, so is_alive() is False
    p_list = [multiprocessing.Process() for p in range(n_cpu)]
    for i in range(num_jobs):
        task_dir = str(i)
        task_processed = False
        while not task_processed:
            # Search through all processes in p_list repeatedly until a
            # terminated process is found to take on a new task
            for p in range(len(p_list)):
                if not p_list[p].is_alive():
                    p_list[p] = multiprocessing.Process(target=task_fn, args=(task_dir,))
                    p_list[p].start()
                    task_processed = True
                    # Stop scanning, otherwise every idle slot starts the same task
                    break
    # At the end of the outermost for loop
    # Wait until all the processes have finished
    for p in p_list:
        p.join()
    print("All Done!")
Answer 0 (score: 1)
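Instead of spawning and managing the processes yourself, use a Pool of workers. It is designed to handle all of that for you.
Since your workers are themselves spawning subprocesses, you can use threads instead of processes.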
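To make the pool idea concrete, here is a minimal sketch, assuming threads are enough because each worker only waits on a child process. It uses `multiprocessing.pool.ThreadPool`, which exposes the same interface as `multiprocessing.Pool` but runs the workers as threads. The command is a stand-in (the running Python interpreter printing the name of its working directory) so the sketch runs anywhere; `CMD`, `run_task`, `run_all` and the temporary directory layout are hypothetical names, not part of the question's code.

```python
import os
import subprocess
import sys
import tempfile
from multiprocessing.pool import ThreadPool

# Stand-in for ["my_exe", "-my_exe_arguments"] (assumption for illustration):
# prints the name of the directory the command runs in.
CMD = [sys.executable, "-c", "import os; print(os.path.basename(os.getcwd()))"]


def run_task(task_dir):
    """Run the command inside task_dir and return (task_dir, output)."""
    out = subprocess.check_output(
        CMD, cwd=task_dir, stderr=subprocess.STDOUT, universal_newlines=True
    )
    return task_dir, out.strip()


def run_all(task_dirs, n_workers=None):
    """Run every task on a pool of threads; results arrive in completion order."""
    # n_workers=None lets ThreadPool fall back to os.cpu_count()
    with ThreadPool(n_workers) as pool:
        return dict(pool.imap_unordered(run_task, task_dirs))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        dirs = []
        for i in range(8):  # 8 hypothetical task directories named 0..7
            path = os.path.join(root, str(i))
            os.mkdir(path)
            dirs.append(path)
        results = run_all(dirs, n_workers=4)
        print(len(results), "tasks finished")
```

The pool hands the next directory to whichever worker becomes free, which is the bookkeeping the hand-written loop over `p_list` was doing.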
Also, the workers all seem to write to the same file. You need to protect it against concurrent access, otherwise the results will be garbled.
import subprocess
from threading import Lock
from concurrent.futures import ThreadPoolExecutor

mutex = Lock()  # serialises writes to the shared log file
task_dir = "/tmp/tasks"


def task_fn(task_nr):
    """This function will run in a separate thread."""
    cmd_str = ["my_exe", "-my_exe_arguments"]
    try:
        msg = subprocess.check_output(cmd_str, cwd=task_dir, stderr=subprocess.STDOUT, universal_newlines=True)
    except subprocess.CalledProcessError as e:
        with mutex:
            # Append, so one failure does not overwrite the log of another
            with open("a_unique_PROTECTED_err_log_file.log", "a") as f:
                f.write(e.output)
    return task_nr


with ThreadPoolExecutor() as pool:
    iterator = pool.map(task_fn, range(100000))
    for result in iterator:
        print("Task %d done" % result)
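If you would rather not share a lock between workers, one alternative is to have each worker return its outcome and let only the main thread write the log. The following is a rough sketch under that assumption, not a drop-in replacement: the failing executable is simulated with the Python interpreter (odd task numbers exit with status 1), and `make_cmd`, `run_task`, `run_all` and `errors.log` are illustrative names.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_cmd(task_nr):
    # Stand-in for the real executable: odd task numbers exit with status 1.
    code = "import sys; print('task %d'); sys.exit(%d)" % (task_nr, task_nr % 2)
    return [sys.executable, "-c", code]


def run_task(task_nr):
    """Return (task_nr, ok, output) instead of writing to a shared file."""
    try:
        out = subprocess.check_output(
            make_cmd(task_nr), stderr=subprocess.STDOUT, universal_newlines=True
        )
        return task_nr, True, out
    except subprocess.CalledProcessError as e:
        return task_nr, False, e.output


def run_all(n_tasks, log_path, max_workers=4):
    """Run tasks on threads; only the calling thread appends to log_path."""
    failed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool, open(log_path, "a") as log:
        futures = [pool.submit(run_task, n) for n in range(n_tasks)]
        for fut in as_completed(futures):
            task_nr, ok, output = fut.result()
            if not ok:
                failed.append(task_nr)
                log.write("task %d failed:\n%s\n" % (task_nr, output))
    return sorted(failed)


if __name__ == "__main__":
    print(run_all(6, "errors.log"))
```

Because the log file is only ever touched by one thread, no mutex is needed, and the returned list tells you which tasks to rerun.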