I have the following C/C++ code that uses Microsoft MPI:
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"
int main (int argc, char *argv[])
{
int err, numtasks, taskid;
int out=0,val;
MPI_Status status;
MPI_Request req;
err=MPI_Init(&argc, &argv);
err=MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
err=MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
int receiver=(taskid+1)% numtasks;
int sender= (taskid-1+numtasks)% numtasks;
printf("sender %d, receiver %d, rank %d\n",sender,receiver, taskid);
val=50;
MPI_Isend(&val, 1, MPI_INT, receiver, 1, MPI_COMM_WORLD, &req);
MPI_Irecv(&out, 1, MPI_INT, sender, 1, MPI_COMM_WORLD, &req);
printf ("Rank: %d , Value: %d\n", taskid, out );
err=MPI_Finalize();
return 0;
}
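If I launch it with more than 2 processes, the application deadlocks. With 2 processes the application runs, but "out" is never updated. The same code works with the Linux MPI distributions; the problem seems to occur only with the Microsoft version. Any help?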
Answer 0 (score: 3)
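First, each MPI process performs two communications: a single send and a single receive. You therefore need to store two requests (MPI_Request req[2]) and two statuses (MPI_Status status[2]). Passing the same req to both calls overwrites the handle of the send with the handle of the receive, so the send can never be waited on.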
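Second, you need to wait after calling the non-blocking send/receive to make sure both have completed. Until the wait returns, the contents of out are undefined and val must not be modified.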
MPI_Isend(&val, 1, MPI_INT, receiver, 1, MPI_COMM_WORLD, &req[0]);
MPI_Irecv(&out, 1, MPI_INT, sender, 1, MPI_COMM_WORLD, &req[1]);
// While the communication is happening, here you can overlap computation
// on data that is NOT being currently communicated, with the communication of val/out
MPI_Waitall(2, req, status);
// Now both the send and receive have been finished for this process,
// and we can access out, assured that it is valid
printf ("Rank: %d , Value: %d\n", taskid, out);
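As for why this works with the Linux distributions but not the Microsoft one... I can only assume that the Linux implementation effectively implements non-blocking communication as blocking communication. That is, they "cheat" and finish your communication earlier than the non-blocking semantics require. This makes things easier for them, because they do not have to keep track of information about the pending communication, but it also destroys your ability to overlap computation and communication. You should not rely on it to work.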