Here are my shell script and my Python code.
$ cat go
while true
do
echo "------->"
python3 -m mpi4py ./go.py
echo "<------"
done
This script runs go.py with python3 in an endless loop.
$ cat go.py
import mpi4py.MPI as MPI
print( "######", MPI.Is_initialized())
comm = MPI.COMM_WORLD
comm_rank = comm.Get_rank()
comm_size = comm.Get_size()
# Point-to-point ring exchange: each rank sends to the next rank
# and receives from the previous one (indices wrap around).
data_send = [comm_rank]*5
comm.send(data_send, dest=(comm_rank+1)%comm_size)
data_recv = comm.recv(source=(comm_rank-1)%comm_size)
print("my rank is %d, and Ireceived:" % comm_rank)
print( data_recv )
MPI.Finalize()
print( "######", MPI.Is_finalized())
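This Python code just exchanges a small list between neighboring ranks and prints what it received.
After I launch the go script, go.py runs and exits; when go.py is executed a second time, it hangs.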
$ mpirun --mca orte_base_help_aggregate 0 -np 2 sh ./go
------->
------->
--------------------------------------------------------------------------
[[27909,1],1]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: myvm20
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[[27909,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: myvm20
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
###### True
###### True
my rank is 0, and Ireceived:
[1, 1, 1, 1, 1]
my rank is 1, and Ireceived:
[0, 0, 0, 0, 0]
###### True
###### True
<------
------->
<------
------->
--------------------------------------------------------------------------
[[27909,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: myvm20
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[[27909,1],1]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: myvm20
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
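It hangs here forever.
Why does it hang, and how can I keep using this script?
By the way: I have two kinds of jobs, A and B. Job A keeps running, while job B finishes and exits. So I cannot run it like this: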
while true
do
echo "------->"
mpirun -np 2 A : -np 2 B
echo "<------"
done
Is there any other way?
Answer 0 (score: 0)
Long story short, you cannot do that.
Here is what you should do instead:
{{1}}
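
A rough sketch of the general idea, assuming the hang arises because a process slot started by mpirun can go through MPI initialization and finalization only once, so a second go.py started in the same slot cannot initialize again. Under that assumption, the loop can be moved outside mpirun so that every iteration gets a fresh MPI job. The helper names build_command and run_repeatedly and the runner parameter are hypothetical, introduced only for illustration. Note that this restarts every rank on each iteration, so by itself it does not keep a long-lived job A running next to a short-lived job B.

```python
import subprocess
import sys


def build_command(nprocs, script):
    """Return the command line for one fresh MPI job (hypothetical helper)."""
    return ["mpirun", "-np", str(nprocs), sys.executable, "-m", "mpi4py", script]


def run_repeatedly(nprocs, script, iterations, runner=subprocess.run):
    """Start a new mpirun job for every iteration and collect the exit codes.

    `runner` defaults to subprocess.run; it is a parameter only so the loop
    can be exercised without a real MPI installation.
    """
    exit_codes = []
    for _ in range(iterations):
        print("------->")
        # Each call is a brand-new mpirun, so every rank initializes MPI once.
        result = runner(build_command(nprocs, script))
        exit_codes.append(result.returncode)
        print("<------")
    return exit_codes
```

Calling run_repeatedly(2, "./go.py", iterations=3) would launch three independent two-rank jobs one after another; it assumes mpirun and mpi4py are installed and on the path.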