Error when running the mpirun command

Time: 2012-03-15 04:22:32

Tags: mpi

    --------------------------------------------------------------------------
    MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
    with errorcode 1.

    NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
    You may or may not see output from other processes, depending on
    exactly when Open MPI kills them.
    --------------------------------------------------------------------------
    --------------------------------------------------------------------------
    mpirun has exited due to process rank 2 with PID 19175 on
    node mosura15 exiting without calling "finalize". This may
    have caused other processes in the application to be
    terminated by signals sent by mpirun (as reported here).

I am running a simulation. When I launch it with mpirun, I get the error above. What is the cause, and how can I fix it?

2 answers:

Answer 0: (score: 3)

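It looks like the third instance of your program (rank 2) crashed and exited without calling MPI_Finalize(), so mpirun shut down all the other copies of the program as well. Is there a problem that makes that particular node crash, or is it a different node each time?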
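To check whether the same node fails every time, one option is to save the stderr of several runs and pull the rank, PID, node, and error code out of each message. Here is a minimal sketch in Python. The function names `parse_mpirun_failure` and `count_failing_nodes` are made up for illustration, and the regular expressions assume the exact Open MPI wording quoted in the question; other versions may phrase the messages differently.

```python
import re
from collections import Counter

# These patterns follow the Open MPI wording quoted in the question.
# They are an assumption: other versions may word the messages differently.
ABORT_RE = re.compile(
    r"MPI_ABORT was invoked on rank (\d+) in communicator (\S+)\s+"
    r"with errorcode (-?\d+)"
)
EXIT_RE = re.compile(
    r"process rank (\d+) with PID (\d+) on\s+node (\S+) exiting"
)


def parse_mpirun_failure(text):
    """Return the failure details found in one run's stderr as a dict."""
    info = {}
    abort = ABORT_RE.search(text)
    if abort:
        info["abort_rank"] = int(abort.group(1))
        info["communicator"] = abort.group(2)
        info["errorcode"] = int(abort.group(3))
    exited = EXIT_RE.search(text)
    if exited:
        info["exit_rank"] = int(exited.group(1))
        info["pid"] = int(exited.group(2))
        info["node"] = exited.group(3)
    return info


def count_failing_nodes(logs):
    """Tally which node is reported as failing across several runs."""
    return Counter(
        parsed["node"]
        for parsed in map(parse_mpirun_failure, logs)
        if "node" in parsed
    )


# The message from the question, including its line wrapping.
SAMPLE = """
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 1.

mpirun has exited due to process rank 2 with PID 19175 on
node mosura15 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
"""

if __name__ == "__main__":
    print(parse_mpirun_failure(SAMPLE))
    print(count_failing_nodes([SAMPLE]))
```

If one node dominates the tally over many runs, that points toward a problem with that machine. If the failures are spread across nodes, the cause is more likely in the program or its input.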

Answer 1: (score: 3)

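The message is clear: rank 2 called MPI_Abort(), which stops the whole program. You should be able to look through your code and find the error condition under which the program calls MPI_Abort().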