I wrote a script that I run with mpi4py under Python 2.7 on an Ubuntu 14.04 LTS machine. Here is a snippet from the beginning:
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print comm.Get_size()
On my old computer, if I then run mpiexec -n 3 python2.7 foo.py, I get:
3
3
3
I recently started migrating my software to a new Ubuntu 14.04 LTS server. When I run the same command there, I get:
1
1
1
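Clearly something is wrong here, but I don't know where to look because my MPI knowledge is limited. I tried checking the MPI version. On the old computer, running mpiexec --version returns: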
HYDRA build details:
Version: 1.4.1p1
Release Date: Thu Sep 1 13:53:02 CDT 2011
CC: gcc
CXX: c++
F77: gfortran
F90: f95
Configure options: '--enable-shared' '--prefix=/opt/anaconda1anaconda2anaconda3' '--disable-option-checking' 'CC=gcc' 'CFLAGS= -O2' 'LDFLAGS= ' 'LIBS=-lrt -lpthread ' 'CPPFLAGS= -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpl/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpl/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/openpa/src -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/openpa/src -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/datatype -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/datatype -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/locks -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/locks -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/utils/monitor -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/utils/monitor -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/util/wrappers -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/util/wrappers'
Process Manager: pmi
Launchers available: ssh rsh fork slurm ll lsf sge manual persist
Topology libraries available: hwloc plpa
Resource management kernels available: user slurm ll lsf sge pbs
Checkpointing libraries available:
Demux engines available: poll select
If I run it on the new computer, I get:
mpiexec (OpenRTE) 1.6.5
Report bugs to http://www.open-mpi.org/community/help/
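Am I running different MPI implementations here, and could that be causing the problem? How can I tell? Or is the problem on the Python side? It looks as if three processes are being launched, but Python has not quite realized it. I realize the latter could be caused by mpi4py and mpiexec using different MPI implementations.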
If I run which mpiexec on either machine, it returns:
/home/pmj27/anaconda2/bin/mpiexec
Running mpi4py.get_config() returns:
{'mpicxx': '/home/pmj27/anaconda2/bin/mpicxx', 'mpif77': '/home/pmj27/anaconda2/bin/mpif77', 'mpicc': '/home/pmj27/anaconda2/bin/mpicc', 'mpif90': '/home/pmj27/anaconda2/bin/mpif90'}
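One way to test the suspicion that mpiexec and mpi4py belong to different MPI implementations is to compare what each one reports. The following is only a rough sketch, not a definitive diagnostic: the helper names mpi_family, launcher_family and library_family are made up for illustration, and the substring matching is an assumption based on the two version banners shown above ("HYDRA" for MPICH, "OpenRTE" for Open MPI). It relies on MPI.get_vendor() from mpi4py to name the library mpi4py was built against.

```python
from __future__ import print_function

import subprocess


def mpi_family(text):
    """Guess the MPI family named in a version banner or vendor name.

    The substrings are assumptions taken from the banners in this post.
    """
    lowered = text.lower()
    if "openrte" in lowered or "open mpi" in lowered or "open-mpi" in lowered:
        return "Open MPI"
    if "hydra" in lowered or "mpich" in lowered:
        return "MPICH"
    return "unknown"


def launcher_family(command="mpiexec"):
    """Classify the launcher found on PATH from its --version output."""
    try:
        raw = subprocess.check_output([command, "--version"],
                                      stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        # Launcher missing or it rejected --version: nothing to classify.
        return "unknown"
    return mpi_family(raw.decode("utf-8", "replace"))


def library_family():
    """Classify the MPI library that mpi4py was built against."""
    try:
        from mpi4py import MPI
    except ImportError:
        return "unknown"
    vendor_name, _version = MPI.get_vendor()
    return mpi_family(vendor_name)


if __name__ == "__main__":
    launcher = launcher_family()
    library = library_family()
    print("mpiexec on PATH:", launcher)
    print("mpi4py built against:", library)
    if "unknown" not in (launcher, library) and launcher != library:
        print("Mismatch: launcher and mpi4py use different MPI families.")
```

Run it directly with python2.7 (not through mpiexec) on each machine. If the two lines disagree, for example an Open MPI launcher with an MPICH-built mpi4py, each process would likely start as its own single-process job, which would be consistent with every rank printing 1.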