After compiling mpi4py against the OpenMPI installation on our server, I get a runtime error.
OS: SuSE
GCC: 4.8.5
OpenMPI: 1.10.1
HDF5: 1.8.11
mpi4py: 2.0.0
Python: 2.7.9
Environment setup: I use virtualenv (I have no admin rights on the server)
(ENV) username@servername:~/test> echo $PATH
/opt/local/tools/hdf5/hdf5-1.8.11_openmpi-1.10.1_gcc-4.8.5/bin:/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/bin:/home/username/test/virtualenv-15.0.3/ENV/bin: [other libs ] :/opt/local/bin:/usr/lib64/mpi/gcc/openmpi/bin:/usr/local/bin:/usr/bin:/bin
(ENV) username@servername:~/test> echo $LD_LIBRARY_PATH
/opt/local/tools/hdf5/hdf5-1.8.11_openmpi-1.10.1_gcc-4.8.5/lib:/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/lib
(ENV) username@servername:~/test> pip freeze
cycler==0.10.0
Cython==0.24.1
dill==0.2.5
matplotlib==1.5.3
multiprocessing==2.6.2.1
numpy==1.11.1
pyfits==3.4
pyparsing==2.1.9
python-dateutil==2.5.3
pytz==2016.6.1
scipy==0.18.1
six==1.10.0
Compiling and installing mpi4py:
(ENV) username@servername:~/test> wget https://bitbucket.org/mpi4py/mpi4py/downloads/mpi4py-2.0.0.tar.gz
(ENV) username@servername:~/test> tar xzvf mpi4py-2.0.0.tar.gz
(ENV) username@servername:~/test> cd mpi4py-2.0.0/
(ENV) username@servername:~/test> vim mpi.cfg
In mpi.cfg, I added a section for the custom Open MPI build:
[mpi]
mpi_dir = /opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5
mpicc = %(mpi_dir)s/bin/mpicc
mpicxx = %(mpi_dir)s/bin/mpicxx
library_dirs = %(mpi_dir)s/lib
runtime_library_dirs = %(library_dirs)s
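As a side note, mpi.cfg is read with Python's ConfigParser, so the %(mpi_dir)s references above are ordinary interpolation within the [mpi] section. A minimal sketch of how the section resolves (Python 3 syntax shown for brevity, although the build here uses Python 2.7, where the module is named ConfigParser):

```python
# Sketch: mpi.cfg is ConfigParser-style, so %(mpi_dir)s expands to the
# value of mpi_dir in the same section, recursively if needed.
from configparser import ConfigParser

cfg_text = """\
[mpi]
mpi_dir = /opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5
mpicc = %(mpi_dir)s/bin/mpicc
mpicxx = %(mpi_dir)s/bin/mpicxx
library_dirs = %(mpi_dir)s/lib
runtime_library_dirs = %(library_dirs)s
"""

parser = ConfigParser()
parser.read_string(cfg_text)

# Interpolation expands the placeholders on access:
print(parser.get("mpi", "mpicc"))
print(parser.get("mpi", "runtime_library_dirs"))
```

This is only to show why %(library_dirs)s may itself reference %(mpi_dir)s: interpolation is applied recursively when a value is retrieved.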
Compile:
(ENV) username@servername:python setup.py build --mpi=mpi
Install:
(ENV) username@servername:python setup.py install
First basic test (OK):
(ENV) username@servername: mpiexec -n 5 python -m mpi4py helloworld
Hello, World! I am process 0 of 5 on servername.
Hello, World! I am process 1 of 5 on servername.
Hello, World! I am process 2 of 5 on servername.
Hello, World! I am process 3 of 5 on servername.
Hello, World! I am process 4 of 5 on servername.
The second basic test produces an error:
(ENV) username@servername: python
>>> from mpi4py import MPI
--------------------------------------------------------------------------
Error obtaining unique transport key from ORTE (orte_precondition_transports not present in the environment).
Local host: servername
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during MPI_INIT; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
PML add procs failed
--> Returned "Error" (-1) instead of "Success" (0)
-------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[servername:165332] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
(ENV) username@servername:~/test/mpi4py-2.0.0>
Update: during the compilation of mpi4py I also get this error:
checking for library 'lmpe' ...
/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/bin/mpicc -pthread -fno-strict-aliasing -fmessage-length=0 -grecord-gcc-switches -fstack-protector -O2 -Wall -D_FORTIFY_SOURCE=2 -funwind-tables -fasynchronous-unwind-tables -g -DNDEBUG -fmessage-length=0 -grecord-gcc-switches -fstack-protector -O2 -Wall -D_FORTIFY_SOURCE=2 -funwind-tables -fasynchronous-unwind-tables -g -DOPENSSL_LOAD_CONF -fPIC -I/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/include -c _configtest.c -o _configtest.o
/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/bin/mpicc -pthread _configtest.o -L/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/lib -Wl,-R/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/lib -llmpe -o _configtest
/usr/lib64/gcc/x86_64-suse-linux/4.8/../../../../x86_64-suse-linux/bin/ld: cannot find -llmpe
collect2: error: ld returned 1 exit status
failure.
Answer (score: 0)
See: https://bitbucket.org/mpi4py/mpi4py/issues/52/mpi4py-compilation-error
It seems the problem is not a bug in mpi4py, but comes from OpenMPI's PSM transport layer (MTL): when the interpreter is started directly instead of via mpiexec, MPI_Init runs as a singleton and the PSM layer cannot obtain the ORTE transport key.
In my setup,
export OMPI_MCA_mtl=^psm
resolved the runtime error above.
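If exporting the variable in every shell or batch job is inconvenient, the same MCA parameter can also be set from Python itself, as long as this happens before the very first mpi4py import (which is when MPI_Init_thread runs). A minimal sketch; the mpi4py import line is left commented and assumes the build from the question:

```python
import os

# Open MPI picks up MCA parameters from OMPI_MCA_* environment variables,
# so disabling the PSM MTL must happen before MPI_Init runs -- i.e. before
# the first "from mpi4py import MPI" anywhere in the process.
os.environ["OMPI_MCA_mtl"] = "^psm"

# from mpi4py import MPI   # uncomment on the server; MPI now skips PSM

print(os.environ["OMPI_MCA_mtl"])
```

Note that this only works if no other module has imported mpi4py earlier in the same process; once MPI is initialized, changing the variable has no effect.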