I'm trying to diagnose some problems with MPI_Comm_spawn. When I run this simple example program:
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char *argv[] )
{
    int np[2] = { 1, 1 };
    int errcodes[2];
    MPI_Comm parentcomm, intercomm;
    char *cmds[2] = { "spawn_example", "spawn_example" };
    MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };

    MPI_Init( &argc, &argv );
    MPI_Comm_get_parent( &parentcomm );
    if (parentcomm == MPI_COMM_NULL)
    {
        /* Create 2 more processes - this example must be called spawn_example.exe for this to work. */
        MPI_Comm_spawn_multiple( 2, cmds, MPI_ARGVS_NULL, np, infos, 0, MPI_COMM_WORLD, &intercomm, errcodes );
        printf("I'm the parent.\n");
    }
    else
    {
        printf("I'm the spawned.\n");
    }
    fflush(stdout);
    MPI_Finalize();
    return 0;
}
I get the following output:
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_dpm_dyn_init() failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
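I'm using Open MPI 3.1.1. I know that earlier versions of Open MPI had problems with spawn, but I thought those had been fixed in this release. Does anyone know what else could be going on?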