Failed to modify the attributes of a queue pair when using Open MPI

Time: 2017-04-05 07:36:53

Tags: python mpi

Today I ran into an MPI error when calling mpirun that I had never seen before. The error output is:

Failed to modify the attributes of a queue pair (QP):

Hostname: nmyjs_104_22
Mask for QP attributes to be modified: 113
Error:    Invalid argument
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI has detected that there are UD-capable Verbs devices on your
system, but none of them were able to be setup properly.  This may
indicate a problem on this system.

You job will continue, but Open MPI will ignore the "ud" oob component
in this run.

Hostname: nmyjs_104_22
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun could not find anything to do.

It is possible that you forgot to specify how many processes to run
via the "-np" argument.
--------------------------------------------------------------------------

The output seems to suggest that Open MPI will keep running regardless. In practice, though, even a trivial command fails.

For example, running mpirun -np 1 echo "hello" produces:

[@nmyjs_104_22 ~]$ mpirun -np 1 echo "hello"
--------------------------------------------------------------------------
Failed to modify the attributes of a queue pair (QP):

Hostname: nmyjs_104_22
Mask for QP attributes to be modified: 113
Error:    Invalid argument
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI has detected that there are UD-capable Verbs devices on your
system, but none of them were able to be setup properly.  This may
indicate a problem on this system.

You job will continue, but Open MPI will ignore the "ud" oob component
in this run.

Hostname: nmyjs_104_22
--------------------------------------------------------------------------
[nmyjs_104_22:21719] *** Process received signal ***
[nmyjs_104_22:21719] Signal: Segmentation fault (11)
[nmyjs_104_22:21719] Signal code: Address not mapped (1)
[nmyjs_104_22:21719] Failing at address: 0x100000007
[nmyjs_104_22:21719] [ 0] /lib64/libpthread.so.0[0x3da4a0f710]
[nmyjs_104_22:21719] [ 1] /usr/local/lib/libopen-rte.so.20(orte_rml_send_callback+0x13)[0x7f59b87904e3]
[nmyjs_104_22:21719] [ 2] /usr/local/lib/openmpi/mca_rml_oob.so(+0x1e34)[0x7f59b4ccee34]
[nmyjs_104_22:21719] [ 3] /usr/local/lib/libopen-pal.so.20(opal_libevent2022_event_base_loop+0xbf1)[0x7f59b84a7ec1]
[nmyjs_104_22:21719] [ 4] mpirun[0x404b41]
[nmyjs_104_22:21719] [ 5] mpirun[0x403466]
[nmyjs_104_22:21719] [ 6] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3da461ed5d]
[nmyjs_104_22:21719] [ 7] mpirun[0x403359]
[nmyjs_104_22:21719] *** End of error message ***
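I searched Google without any luck. The only thing I changed today was updating Open MPI from 2.0.2 to 2.1.0, using the release from this page. I have since reverted Open MPI to 2.0.2, but the error still occurs. I hope someone can help.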
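A quick sanity check that may help narrow this down: the backtrace loads libraries from /usr/local/lib, so it is worth confirming which mpirun is actually being picked up after the revert and what version it reports. The sketch below is only a diagnostic under assumptions, not a fix. It assumes `mpirun --version` prints a line such as "mpirun (Open MPI) 2.1.0", and the function names `parse_open_mpi_version` and `report_mpirun` are hypothetical helpers written for this example.

```python
import re
import shutil
import subprocess


def parse_open_mpi_version(version_output):
    """Extract the version from `mpirun --version` text, or return None.

    Assumes the output contains a line like "mpirun (Open MPI) 2.1.0".
    """
    match = re.search(r"\(Open MPI\)\s+(\d+(?:\.\d+)+\S*)", version_output)
    return match.group(1) if match else None


def report_mpirun():
    """Print which mpirun is first on PATH and the version it reports."""
    path = shutil.which("mpirun")
    if path is None:
        print("mpirun not found on PATH")
        return
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    print("mpirun resolved to:", path)
    print("reported version:", parse_open_mpi_version(result.stdout))


if __name__ == "__main__":
    report_mpirun()
```

If the reported version is not the one expected after the downgrade, leftover files from the other installation under /usr/local may be involved, though that is only one possible explanation for the error.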

Thanks.

0 answers:

No answers yet