MPI: Error when executing MPI_Finalize()

Time: 2011-01-20 21:37:30

Tags: mpi

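This is the first time I have hit an error when executing MPI_Finalize(). I believe the communication is causing the problem, but I cannot tell what exactly is going wrong.

When I run on 1 processor it works fine, but on 2 or more processors I get a segmentation fault.

The error message is: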

[seismicmstm:32604] *** Process received signal ***
[seismicmstm:32604] Signal: Segmentation fault (11)
[seismicmstm:32604] Signal code:  (128)
[seismicmstm:32604] Failing at address: (nil)
[seismicmstm:32604] [ 0] /lib64/libpthread.so.0 [0x311c60eb10]
[seismicmstm:32604] [ 1] /usr/local/openmpi-1.4.2/lib/libopen-pal.so.0(opal_memory_ptmalloc2_int_malloc+0x2f4) [0x2b6955551794]
[seismicmstm:32604] [ 2] /usr/local/openmpi-1.4.2/lib/libopen-pal.so.0 [0x2b6955553543]
[seismicmstm:32604] [ 3] /lib64/libc.so.6(__libc_calloc+0x330) [0x311ba74bc0]
[seismicmstm:32604] [ 4] /lib64/ld-linux-x86-64.so.2 [0x311b609d65]
[seismicmstm:32604] [ 5] /lib64/ld-linux-x86-64.so.2 [0x311b605a9c]
[seismicmstm:32604] [ 6] /lib64/ld-linux-x86-64.so.2 [0x311b6076e1]
[seismicmstm:32604] [ 7] /lib64/ld-linux-x86-64.so.2 [0x311b610bb6]
[seismicmstm:32604] [ 8] /lib64/ld-linux-x86-64.so.2 [0x311b60ce06]
[seismicmstm:32604] [ 9] /lib64/ld-linux-x86-64.so.2 [0x311b6105bc]
[seismicmstm:32604] [10] /lib64/libc.so.6 [0x311bb08df0]
[seismicmstm:32604] [11] /lib64/ld-linux-x86-64.so.2 [0x311b60ce06]
[seismicmstm:32604] [12] /lib64/libc.so.6(__libc_dlopen_mode+0x47) [0x311bb08f57]
[seismicmstm:32604] [13] /lib64/libpthread.so.0 [0x311c60f1dc]
[seismicmstm:32604] [14] /lib64/libpthread.so.0 [0x311c60f2f0]
[seismicmstm:32604] [15] /lib64/libpthread.so.0(__pthread_unwind+0x40) [0x311c60d160]
[seismicmstm:32604] [16] /lib64/libpthread.so.0 [0x311c607985]
[seismicmstm:32604] [17] /usr/local/openmpi-1.4.2/lib/openmpi/mca_btl_openib.so [0x2b695869d22b]
[seismicmstm:32604] [18] /lib64/libpthread.so.0 [0x311c60673d]
[seismicmstm:32604] [19] /lib64/libc.so.6(clone+0x6d) [0x311bad3f6d]
[seismicmstm:32604] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 32604 on node seismicmstm.cluster exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

All I do in the code is scatter, gather, and broadcast data. Can anyone tell me how to debug this?

1 Answer:

Answer 0: (score: 0)

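There are two possible causes:

1) Your MPI_Finalize is faulty: check whether the MPI library works correctly by running one of the sample codes that ship with the MPI distribution, such as CPI. If you do not have access to the distribution, you can download the tar file and extract the CPI code, or download any simple Hello World application from the web. I highly recommend http://www.citutor.org/. If the sample code works, your MPI library is fine and your code is at fault. If it does not, the library is not working properly; download the implementation of your choice and build a fresh copy.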

2) The code does not die (segfault) in MPI_Finalize itself, but somewhere before MPI_Finalize. Can you confirm that the segfault occurs inside MPI_Finalize and not earlier?