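I have an application parallelized with MPI that is split into a number of different tasks. Each processor is assigned exactly one task, and the group of processors assigned the same task gets its own communicator. The tasks need to synchronize periodically. Currently the synchronization is done through MPI_COMM_WORLD, but that has the drawback that no collective operations can be used, since there is no guarantee that the other tasks will ever reach that block of code.
As a more concrete example: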
task1: equation1_solver, N nodes, communicator: mpi_comm_solver1
task2: equation2_solver, M nodes, communicator: mpi_comm_solver2
task3: file IO, 1 node, communicator: mpi_comm_io
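I would like to MPI_SUM an array over task1 and have the result show up on task3. Is there an efficient way to do this? (My apologies if this is a stupid question; I don't have much experience creating and using custom MPI communicators.)
Answer 0 (score: 5)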
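Charles is exactly right; intercommunicators allow you to talk between communicators (or, to distinguish the "normal" communicators in this context, "intra-communicators", which doesn't strike me as much of an improvement).
I've always found the use of these intercommunicators a little confusing for those new to them. Not the basic ideas, which make sense, but the mechanics of using (say) MPI_Reduce with one of them. The group of tasks doing the reduction specifies the root rank in the remote communicator, so far so good; but within the remote communicator, every rank that is not the root specifies MPI_PROC_NULL as the root, whereas the actual root specifies MPI_ROOT. The things one does for backwards compatibility, hey?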
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int commnum = 0;                     /* which of the 3 comms I belong to */
    MPI_Comm mycomm;                     /* communicator I belong to */
    MPI_Comm intercomm = MPI_COMM_NULL;  /* inter-communicator */
    int cw_rank, cw_size;                /* rank, size in MPI_COMM_WORLD */
    int rank;                            /* rank in local communicator */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &cw_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &cw_size);

    /* with fewer than 3 tasks comm 0 is empty and the IO task would
     * wait forever for a leader rank that is never sent */
    if (cw_size < 3) {
        if (cw_rank == 0)
            fprintf(stderr, "Run with at least 3 tasks\n");
        MPI_Finalize();
        return 1;
    }

    if (cw_rank == cw_size-1)            /* last task is IO task */
        commnum = 2;
    else {
        if (cw_rank < (cw_size-1)/2)
            commnum = 0;
        else
            commnum = 1;
    }

    printf("Rank %d in comm %d\n", cw_rank, commnum);

    /* create the local communicator, mycomm */
    MPI_Comm_split(MPI_COMM_WORLD, commnum, cw_rank, &mycomm);

    /* every task needs its rank in its local communicator; the
     * reductions below use it in both comm 0 and comm 2 */
    MPI_Comm_rank(mycomm, &rank);

    const int lldr_tag = 1;
    const int intercomm_tag = 2;
    if (commnum == 0) {
        /* comm 0 needs to communicate with comm 2. */
        /* create an intercommunicator: */

        /* rank 0 in our new communicator will be the "local leader"
         * of this communicator for the purpose of the intercommunicator */
        int local_leader = 0;

        /* Now, since we're not part of the other communicator (and vice
         * versa) we have to refer to the "remote leader" in terms of its
         * rank in COMM_WORLD.  For us, that's easy; the remote leader
         * in the IO comm is defined to be cw_size-1, because that's the
         * only task in that comm.  But for them, it's harder.  So we'll
         * send that task the id of our local leader. */

        /* the local leader tells the IO task its rank in COMM_WORLD */
        if (rank == local_leader)
            MPI_Send(&cw_rank, 1, MPI_INT, cw_size-1, lldr_tag, MPI_COMM_WORLD);

        /* now create the inter-communicator */
        MPI_Intercomm_create(mycomm, local_leader,
                             MPI_COMM_WORLD, cw_size-1,
                             intercomm_tag, &intercomm);
    }
    else if (commnum == 2)
    {
        /* there's only one task in this comm */
        int local_leader = 0;
        int rmt_ldr;
        MPI_Status s;
        MPI_Recv(&rmt_ldr, 1, MPI_INT, MPI_ANY_SOURCE, lldr_tag, MPI_COMM_WORLD, &s);
        MPI_Intercomm_create(mycomm, local_leader,
                             MPI_COMM_WORLD, rmt_ldr,
                             intercomm_tag, &intercomm);
    }

    /* now let's play with our communicators and make sure they work */
    if (commnum == 0) {
        int max_of_ranks = 0;
        /* try it internally; */
        MPI_Reduce(&rank, &max_of_ranks, 1, MPI_INT, MPI_MAX, 0, mycomm);
        if (rank == 0) {
            printf("Within comm 0: maximum of ranks is %d\n", max_of_ranks);
            printf("Within comm 0: sum of ranks should be %d\n", max_of_ranks*(max_of_ranks+1)/2);
        }

        /* now try summing it to the other comm */
        /* the "root" parameter here is the root in the remote group;
         * the receive buffer is ignored on this (sending) side */
        MPI_Reduce(&rank, &max_of_ranks, 1, MPI_INT, MPI_SUM, 0, intercomm);
    }

    if (commnum == 2) {
        int sum_of_ranks = -999;
        int rootproc;

        /* get reduction data from other comm */
        if (rank == 0)                   /* am I the root of this reduce? */
            rootproc = MPI_ROOT;
        else
            rootproc = MPI_PROC_NULL;

        MPI_Reduce(&rank, &sum_of_ranks, 1, MPI_INT, MPI_SUM, rootproc, intercomm);

        if (rank == 0)
            printf("From comm 2: sum of ranks is %d\n", sum_of_ranks);
    }

    /* only comms 0 and 2 created an intercommunicator */
    if (commnum == 0 || commnum == 2)
        MPI_Comm_free(&intercomm);
    MPI_Comm_free(&mycomm);

    MPI_Finalize();
    return 0;
}
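Answer 1 (score: 4)
All you need is to create a new communicator that contains the nodes from the tasks you want to communicate with each other. Take a look at MPI groups and communicators. You can find many examples online, here for instance.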