PGI compiler problem: MPI program fails

Time: 2019-07-10 10:12:15

Tags: compilation mpi gpu pgi

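I have some simple code that compiles fine with the GNU compiler. I then switched to the PGI compiler, and now the program fails. I am compiling on a desktop with a Xeon E5 CPU (16 processors) and two GPU cards, one Titan and one 1080.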

I tested with the following simple hello world program:

#include <mpi.h>
#include <iostream>

using namespace std;

int main(int argc, char **argv){
        int procid, numprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &procid);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

        cout << "hello world"<< endl;

        MPI_Finalize();
        return 0;
}

The following is the resulting error with mpic++:

[WorkStation:14395] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line 388
[WorkStation:14395] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line 166
--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
    orte_init:startup:internal-failure
But I couldn't open the help file:
    /proj/pgi/linux86-64-llvm/2019/mpi/openmpi-3.1.3/share/openmpi/help-orte-runtime: No such file or directory.  Sorry!
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
    mpi_init:startup:internal-failure
But I couldn't open the help file:
    /proj/pgi/linux86-64-llvm/2019/mpi/openmpi-3.1.3/share/openmpi/help-mpi-runtime.txt: No such file or directory.  Sorry!
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[WorkStation:14395] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

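I tried the same thing on a laptop with the same setup. There it compiles and runs fine, with no problems. I began to suspect that the multiple GPUs were causing the problem. Since each GPU has a different PGI target, I tried both -ta=tesla:cc70 and -ta=tesla:cc60. Neither works.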

I don't know how to debug this. I can add more information if needed.

0 answers:

No answers