Unable to run an mpi4py program across multiple machines

Time: 2020-06-25 07:11:23

Tags: python python-3.x mpi nfs mpi4py

I have a very simple MPI Python program:

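from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.rank  # rank of this process within COMM_WORLD
name = MPI.Get_processor_name()  # name of the node running this rank

print('name: ', name, ' rank: ', rank)

# A bare `MPI.Finalize` is never called; mpi4py also finalizes MPI at exit
MPI.Finalize()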

I installed nfs-kernel-server on the host machine and nfs-common on the client machines, following the instructions on the page linked here.

I now run my Python MPI program with the following command:

mpirun --hostfile myhostfile.txt -np 8 python hello.py

When I do so, I get the following error:

[mahmoud-desktop:05540] *** Process received signal ***
[mahmoud-desktop:05540] Signal: Segmentation fault (11)
[mahmoud-desktop:05540] Signal code:  (128)
[mahmoud-desktop:05540] Failing at address: (nil)
[mahmoud-desktop:05540] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f5a89bf0890]
[mahmoud-desktop:05540] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x3d)[0x7f5a8988498d]
[mahmoud-desktop:05540] [ 2] /usr/lib/x86_64-linux-gnu/libopen-pal.so.20(opal_argv_free+0x29)[0x7f5a89e4b519]
[mahmoud-desktop:05540] [ 3] /usr/lib/x86_64-linux-gnu/libopen-rte.so.20(+0x283cb)[0x7f5a8a0d73cb]
[mahmoud-desktop:05540] [ 4] /usr/lib/x86_64-linux-gnu/libopen-rte.so.20(orte_util_add_hostfile_nodes+0xc1)[0x7f5a8a0d83f1]
[mahmoud-desktop:05540] [ 5] /usr/lib/x86_64-linux-gnu/libopen-rte.so.20(orte_ras_base_allocate+0xd3d)[0x7f5a8a1097fd]
[mahmoud-desktop:05540] [ 6] /usr/lib/x86_64-linux-gnu/libopen-pal.so.20(opal_libevent2022_event_base_loop+0xdc9)[0x7f5a89e63209]
[mahmoud-desktop:05540] [ 7] mpirun(+0x74a3)[0x55a0de7394a3]
[mahmoud-desktop:05540] [ 8] mpirun(+0x5aea)[0x55a0de737aea]
[mahmoud-desktop:05540] [ 9] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f5a8980eb97]
[mahmoud-desktop:05540] [10] mpirun(+0x59ea)[0x55a0de7379ea]
[mahmoud-desktop:05540] *** End of error message ***

Segmentation fault (core dumped)
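Frame [4] of the backtrace is orte_util_add_hostfile_nodes and every frame is inside mpirun or the Open MPI libraries, which suggests the crash happens while mpirun is reading myhostfile.txt, before hello.py is ever started. A quick sanity check of the hostfile might therefore be a reasonable first step. The sketch below is only an illustration: it assumes a subset of the hostfile syntax (one host per line, optional slots=N and max_slots=N, # for comments), the function name check_hostfile and the sample hosts are hypothetical, and it is not Open MPI's own parser, so a flagged line is worth a look rather than proven wrong.

```python
import re

# Assumed subset of the Open MPI hostfile syntax: "<host> [slots=N] [max_slots=N]",
# with "#" starting a comment. This is a sanity check, not Open MPI's parser.
_OPTION = re.compile(r"^(slots|max_slots)=\d+$")


def check_hostfile(text):
    """Return (line_number, message) pairs for lines that look malformed."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue  # blank or comment-only line
        host, *options = line.split()
        if "=" in host:
            problems.append((number, "line starts with an option, not a host name"))
        for option in options:
            if not _OPTION.match(option):
                problems.append((number, f"unrecognized option {option!r}"))
    return problems


# Hypothetical hostfile contents, not taken from the original post.
sample = """\
# two well-formed entries followed by two suspicious ones
node1 slots=4
node2 slots=4 max_slots=8
slots=4
node3 slot=4
"""

for number, message in check_hostfile(sample):
    print(f"line {number}: {message}")
```

To check the real file, one could pass open('myhostfile.txt').read() to check_hostfile. If nothing is flagged, the hostfile may still contain something this subset does not model, so the result narrows the search rather than settling it.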

Question: How can I resolve this error?

0 answers:

No answers yet.