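I recently built a cluster of four Raspberry Pi 3 boards using the tutorial found here.
All of the available tests ran fine, and every node behaves as described at the end of that tutorial.
I then started following the tutorial here.
The Python code I am using is: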
from mpi4py import MPI

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

if rank == 0:
    data = [(x+1) ** x for x in range(size)]  # one item per process
    print 'scattering data', data
else:
    data = None
data = comm.scatter(data, root=0)  # rank i receives data[i] from rank 0
print 'rank', rank, 'has data: ', data
My machinefile looks like this:
node1:4
node2:4
node3:4
node4:4
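For context, a `host:4` entry tells Hydra that the host offers four process slots. The sketch below is only a model of that, assuming slots are filled host by host in file order and that placement wraps around once every slot is used; `place_ranks` is a hypothetical helper written for illustration, not part of MPICH or mpi4py.

```python
def place_ranks(machinefile_lines, nprocs):
    """Map MPI ranks to hosts, assuming slots are filled in file order."""
    slots = []
    for line in machinefile_lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        host, _, count = line.partition(':')
        # A bare host name (no ":n") is treated as a single slot.
        slots.extend([host] * int(count or 1))
    # Assumption: wrap around when more processes than slots are requested.
    return [slots[rank % len(slots)] for rank in range(nprocs)]


machinefile = ["node1:4", "node2:4", "node3:4", "node4:4"]
for rank, host in enumerate(place_ranks(machinefile, 5)):
    print("rank %d -> %s" % (rank, host))
```

Under that assumption, the first four ranks all share node1, and a fifth process would be the first one placed on node2, so it would be the first to depend on communication between nodes. Printing `MPI.Get_processor_name()` from each rank would show the real placement.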
If I run the file with the following command:
mpiexec -f machinefile -n 4 python scatter1.py
I get this output:
pi@node1:~/cloud $ mpiexec -f machinefile -n 4 python scatter1.py
scattering data [1, 2, 9, 64]
rank 0 has data: 1
rank 1 has data: 2
rank 2 has data: 9
rank 3 has data: 64
However, if I try more than 4 ranks, for example:
pi@node1:~/cloud $ mpiexec -f machinefile -n 5 python scatter1.py
scattering data [1, 2, 9, 64, 625]
rankrank 2 has data: 9
rank 3 has data: 64 1 has data: Traceback (most recent call last):
File "scatter1.py", line 12, in <module>
2
data = comm.scatter(data,root=0)
File "MPI/Comm.pyx", line 1286, in mpi4py.MPI.Comm.scatter (src/mpi4py.MPI.c:109079)
File "MPI/msgpickle.pxi", line 713, in mpi4py.MPI.PyMPI_scatter (src/mpi4py.MPI.c:48214)
mpi4py.MPI.Exception: Unknown error class, error stack:
PMPI_Scatterv(386).........: MPI_Scatterv(sbuf=0x76a32694, scnts=0x1659de0, displs=0x1656750, MPI_BYTE, rbuf=0x76a354f4, rcount=5, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
MPIR_Scatterv_impl(184)....:
MPIR_Scatterv(108).........:
MPIC_Isend(649)............:
MPIDI_EagerContigIsend(573): failure occurred while attempting to send an eager message
MPIDI_CH3_iSendv(34).......: Communication error with rank 4
^C[mpiexec@node1] Sending Ctrl-C to processes as requested
[mpiexec@node1] Press Ctrl-C again to force abort
[mpiexec@node1] HYDU_sock_write (/home/pi/mpich2/mpich-3.2/src/pm/hydra/utils/sock/sock.c:286): write error (Bad file descriptor)
[mpiexec@node1] HYD_pmcd_pmiserv_send_signal (/home/pi/mpich2/mpich-3.2/src/pm/hydra/pm/pmiserv/pmiserv_cb.c:169): unable to write data to proxy
[mpiexec@node1] ui_cmd_cb (/home/pi/mpich2/mpich-3.2/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:79): unable to send signal downstream
[mpiexec@node1] HYDT_dmxu_poll_wait_for_event (/home/pi/mpich2/mpich-3.2/src/pm/hydra/tools/demux/demux_poll.c:76): callback returned error status
[mpiexec@node1] HYD_pmci_wait_for_completion (/home/pi/mpich2/mpich-3.2/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
[mpiexec@node1] main (/home/pi/mpich2/mpich-3.2/src/pm/hydra/ui/mpich/mpiexec.c:344): process manager error waiting for completion
Can anyone help?