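I have a small piece of test code that acts as a task farm. The idea is to send a list of tasks to a group of processes spawned by mpi4py. Each process changes to a run directory, spawns an mpi-executable in that directory, then returns and retrieves a new task.
The problem is that the mpi-executable appears to run in the top-level directory from which the original program was started.
The master code is executed in /top/level/folder/ and holds a list of tasks, e.g. [0,1,2,3,4,5,6,7,8,9,10]. Each slave receives a task, changes into the directory of the same name before executing the mpi-executable, and then changes back.
The master code: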
#!/usr/bin/env python
from mpi4py import MPI
import numpy as np
import sys
import os
import time
comm = MPI.COMM_WORLD
rank = MPI.COMM_WORLD.Get_rank()
processes=4
# One StopIteration sentinel per slave; pop() takes from the end,
# so the real tasks are handed out first and the sentinels last.
tasks=([StopIteration] * (processes))+[0,1,2,3,4,5,6,7,8,9,10]
new_comm=comm.Spawn("/path/to/slave/slave.py",
                    args=[],maxprocs=processes)
status=MPI.Status()
while tasks:
    # Wait for any slave to ask for work, then reply to that slave.
    new_comm.recv(source=MPI.ANY_SOURCE, status=status)
    data=tasks.pop()
    print("on master received source: ",status.Get_source())
    print("On master sending: ",data," to:",status.Get_source())
    new_comm.send(obj=data,dest=status.Get_source())
    print("On master sent: ",data," to:",status.Get_source())
print("Finished All",rank)
new_comm.Barrier()
print("after barrier",rank)
print("rank", rank,"task",tasks)
new_comm.Disconnect()
And the slave.py code:
#!/usr/bin/env python
from mpi4py import MPI
import numpy as np
import sys
import os
import time
comm = MPI.Comm.Get_parent()
rank = comm.Get_rank()
cwd=os.getcwd()
print("slave", rank," entering loop")
# Request tasks from the master until the StopIteration sentinel arrives.
# The send object is passed explicitly as None; recent mpi4py releases
# require the sendobj argument.
for task in iter(lambda: comm.sendrecv(None, dest=0), StopIteration):
    print("slave ", rank," recvd data", task)
    print("slave ", rank," going to sleep")
    directory=os.path.join(cwd,str(task))
    os.chdir(directory)
    new_comm=MPI.COMM_SELF.Spawn("/path/to/some/mpi-executable",
                                 args=[],maxprocs=4)
    os.chdir(cwd)
    new_comm.Barrier()
    new_comm.Free()
comm.Barrier()
comm.Disconnect()
But every instance of the mpi-executable tries to start in /top/level/folder/.
Any ideas as to why this behavior occurs would be appreciated!
Answer 0 (score: 2)
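MPI_COMM_SPAWN, the MPI operation on which MPI.Comm.Spawn is built, takes an MPI_INFO object that can be used to supply additional, implementation-specific information. In mpi4py this argument is available as the named info parameter.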
info = MPI.Info.Create()
info.Set('key', 'value')
MPI.Comm.Spawn(..., info=info, ...)
For many existing MPI implementations, the info key that sets the working directory of the child processes is wdir.
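Applied to the slave code from the question, this could look roughly like the sketch below. It is not tested against any particular MPI implementation: whether wdir is honored depends on the implementation, and the helper names task_directory and spawn_in_directory are made up for illustration. mpi4py is imported inside the spawning function only so that the path helper can be used on its own.

```python
import os


def task_directory(base_dir, task):
    """Return the absolute run directory for a task (hypothetical helper)."""
    return os.path.abspath(os.path.join(base_dir, str(task)))


def spawn_in_directory(executable, directory, maxprocs=4):
    """Spawn `executable`, asking MPI to start the children in `directory`.

    Hypothetical helper: relies on the 'wdir' info key, which an MPI
    implementation may or may not interpret.
    """
    # Imported here so task_directory() can be used without MPI installed.
    from mpi4py import MPI

    info = MPI.Info.Create()
    info.Set("wdir", directory)
    try:
        return MPI.COMM_SELF.Spawn(executable, args=[],
                                   maxprocs=maxprocs, info=info)
    finally:
        # The info object is only needed for the duration of the call.
        info.Free()
```

Inside the slave loop, the two os.chdir calls and the Spawn call would then be replaced by new_comm = spawn_in_directory("/path/to/some/mpi-executable", task_directory(cwd, task)), with the rest of the loop unchanged.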