mpiexec fails because MPI_Init aborts

Asked: 2011-08-10 13:25:48

Tags: mpich

I am trying to install MPICH2 on a 64-bit machine running Ubuntu 11.04 (Natty Narwhal). I used

sudo apt-get install mpich2

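First, I was surprised to find that mpd was not installed. Searching on Google, I learned that Hydra is the new default process manager. So I tried to run my MPI code and got the following error.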

> -------------------------------------------------------------------------------------------
> [ip-10-99-75-58:02212] [[INVALID],INVALID] ORTE_ERROR_LOG: A
> system-required executable either could not be found or was not
> executable by this user in file
> ../../../../../../orte/mca/ess/singleton/ess_singleton_module.c at
> line 357 [ip-10-99-75-58:02212] [[INVALID],INVALID] ORTE_ERROR_LOG: A
> system-required executable either could not be found or was not
> executable by this user in file
> ../../../../../../orte/mca/ess/singleton/ess_singleton_module.c at
> line 230 [ip-10-99-75-58:02212] [[INVALID],INVALID] ORTE_ERROR_LOG: A
> system-required executable either could not be found or was not
> executable by this user in file ../../../orte/runtime/orte_init.c at
> line 132
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process
> is likely to abort.  There are many reasons that a parallel process
> can fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   orte_ess_set_name failed   --> Returned value A system-required
> executable either could not be found or was not executable by this
> user (-127) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process
> is likely to abort.  There are many reasons that a parallel process
> can fail during MPI_INIT; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   ompi_mpi_init: orte_init failed   --> Returned "A system-required
> executable either could not be found or was not executable by this
> user" (-127) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** The MPI_Init() function was called before MPI_INIT was invoked.
> *** This is disallowed by the MPI standard.
> *** Your MPI job will now abort.
> -------------------------------------------------------------------------------------------

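First, this looks like an Open MPI error, but I installed MPICH2, not Open MPI.

Second, I am not sure how to deal with it, because all the help I can find seems to be aimed at Open MPI users. Am I missing something?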

3 answers:

Answer 0 (score: 11)

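I had the same problem on Ubuntu 12.04. It turned out that both Open MPI and MPICH2 were installed on my machine, and when I compiled my program with mpicc, it was linked against Open MPI instead of MPICH2. To fix this, compile the program with "mpicc.mpich2" and then run it with "mpiexec.mpich2".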
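
As a rough sketch of that workflow: the helper below prefers the MPICH2-suffixed wrapper when it is on PATH and otherwise falls back to the plain name. The function name mpich_tool and the file hello.c are placeholders of mine, not part of any package, and the fallback only helps if the plain name already points at MPICH2.

```shell
# Print the MPICH2-specific name of a tool if it is on PATH,
# otherwise print the generic name unchanged.
# mpich_tool is a hypothetical helper, not something MPICH2 ships.
mpich_tool() {
    if command -v "$1.mpich2" >/dev/null 2>&1; then
        echo "$1.mpich2"
    else
        echo "$1"
    fi
}

MPICC=$(mpich_tool mpicc)
MPIEXEC=$(mpich_tool mpiexec)
echo "compiler wrapper: $MPICC"
echo "launcher:         $MPIEXEC"

# hello.c stands in for your own MPI source file:
#   "$MPICC" -o hello hello.c
#   "$MPIEXEC" -n 4 ./hello
```

The important part is that the same implementation is used for both steps: a binary built by one implementation's mpicc generally cannot be launched by the other's mpiexec.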

Answer 1 (score: 2)

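Indeed, these are all Open MPI error messages. For some reason you seem to have a (possibly misconfigured) copy of Open MPI installed somewhere as well. You can check which file actually runs when you invoke mpiexec by typing which mpiexec. I believe you can then compare that with the output of: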

dpkg --listfiles mpich2

(or similar) to determine where the MPICH2 package was installed.
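
A minimal sketch of that check, assuming a Debian or Ubuntu system with GNU readlink: it resolves mpiexec through any symlinks and then asks dpkg which package owns the resulting file. The function name resolve_tool is a placeholder of mine.

```shell
# Print the real file behind a command name, following symlinks.
# resolve_tool is a hypothetical helper; it returns non-zero if the
# command is not on PATH.
resolve_tool() {
    path=$(command -v "$1") || return 1
    readlink -f "$path"
}

if real=$(resolve_tool mpiexec); then
    echo "mpiexec resolves to: $real"
    # dpkg -S reports which installed package owns a file (Debian/Ubuntu only).
    if command -v dpkg >/dev/null 2>&1; then
        dpkg -S "$real" || echo "no installed package claims $real"
    fi
else
    echo "mpiexec is not on PATH"
fi
```

If the owning package turns out to be an Open MPI package rather than mpich2, that would explain why the error messages come from Open MPI.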

Answer 2 (score: 0)

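This happened to me too, and I tracked down the cause. Somewhere on the system, LD_PRELOAD was being set during startup to point to the libmpi.so from Open MPI.

Example: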

export LD_PRELOAD=<some_directory>/openmpi/1.4.4/lib/libmpi.so

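As a result, MPICH2 failed. Simply run "unset LD_PRELOAD" before running MPICH2 and the problem goes away.

Note that LD_PRELOAD sometimes actually needs to point to Open MPI's libmpi.so for Open MPI to work correctly, so unsetting it may break Open MPI. If you need to use Open MPI afterwards, remember to set it again.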
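
One way to avoid having to restore the variable afterwards is to clear it for a single command only. The sketch below does that in a subshell, so the calling shell keeps its LD_PRELOAD; the function name without_preload is a placeholder of mine, and the mpiexec.mpich2 line assumes that launcher name exists on your system.

```shell
# Run a command with LD_PRELOAD removed from its environment only.
# The parentheses start a subshell, so the unset does not leak
# into the calling shell.
# without_preload is a hypothetical helper name.
without_preload() {
    ( unset LD_PRELOAD; "$@" )
}

# Show that the variable is cleared only inside the helper.
without_preload sh -c 'echo "LD_PRELOAD inside:  ${LD_PRELOAD:-<unset>}"'
echo "LD_PRELOAD outside: ${LD_PRELOAD:-<unset>}"

# Typical use (./hello is a placeholder for your MPICH2-built binary):
#   without_preload mpiexec.mpich2 -n 4 ./hello
```

This only hides the preload from that one launch; finding and fixing whatever startup file exports LD_PRELOAD is still the cleaner long-term fix.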