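I am working on an application in which each processor uses MPI_Isend to send a bunch of messages to other processors and then receives an unknown number of messages.
In my small sample program (code below), I have 4 processors, each sending 4 messages to each of the other 3 processors, so every processor should receive 12 messages.
The problem is that when my machine is happy, my program prints the following: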
Rank 2 receives 12 msgs; Global count = 48
Rank 1 receives 12 msgs; Global count = 48
Rank 3 receives 12 msgs; Global count = 48
Rank 0 receives 12 msgs; Global count = 48
But every once in a while, some processors do not receive enough messages:
Rank 1 receives 9 msgs; Global count = 37
Rank 3 receives 12 msgs; Global count = 37
Rank 2 receives 4 msgs; Global count = 37
Rank 0 receives 12 msgs; Global count = 37
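I know the problem probably lies in the while loop, where I use MPI_Iprobe to check for incoming messages and exit the loop as soon as the check returns false.
But I don't know how else to do it. In other words, how do I make sure that every processor has received all the messages it is supposed to receive by the time it reaches the MPI_Allreduce statement?
My program looks like this: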
#include "mpi.h"
#include <stdbool.h>
#include "stdio.h"
#include "stdlib.h"
int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    //ASSUMPTION: using 4 procs
    //Don't worry about this part.
    //just some stupid way to determine the receivers.
    // Irrelevant to the question.
    int recvs[3];
    int i = 0, nei = 0;
    for (; nei < 4; ++nei)
    {
        if (nei != rank)
        {
            recvs[i] = nei;
            ++i;
        }
    }
    //Proc sending msgs to its neighbors.
    //In this case, it's all other procs. (but in my real app, it's almost never the case)
    int TAG = 0;
    int buff[4] = {555, 666, 777, 888};
    int local_counts[4] = {0, 0, 0, 0}; //EDIT 1
    for (nei = 0; nei < 3; ++nei)
    {
        for (i = 0; i < 4; ++i)
        {
            MPI_Request req;
            MPI_Isend(&buff[i], 1, MPI_INT, recvs[nei], TAG, MPI_COMM_WORLD, &req);
            local_counts[recvs[nei]] += 1; //EDIT 1
        }
    }
    //EDIT 1: tell processors how many msgs they're supposed to get
    int global_counts[4];
    int expectedRecvCount;
    MPI_Reduce(local_counts, global_counts, 4, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Scatter(global_counts, 1, MPI_INT, &expectedRecvCount, 1, MPI_INT, 0, MPI_COMM_WORLD);
    //Receiving
    int recvCount = 0;
    MPI_Status status;
    int hasMsg = 0;
    int num;
    do
    {
        MPI_Iprobe(MPI_ANY_SOURCE, TAG, MPI_COMM_WORLD, &hasMsg, &status);
        if (hasMsg)
        {
            MPI_Recv(&num, 1, MPI_INT, status.MPI_SOURCE, TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            ++recvCount;
            printf("\nRank %d got %d from %d", rank, num, status.MPI_SOURCE);
        }
    }
    while (recvCount < expectedRecvCount); //EDIT 1
    //while (hasMsg);
    //Total number msgs received by all procs.
    //Now here's where I see the problem!!!
    int global_count; //was missing: receives the sum of recvCount over all ranks
    MPI_Allreduce(&recvCount, &global_count, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    printf("\nRank %d receives %d msgs; Global count = %d", rank, recvCount, global_count);
    MPI_Finalize();
    return 0;
}
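===========================================
Edit 1
One approach I can think of is for each processor to keep track of the number of messages it sends to every other processor. Then, once the send operations have been issued, I perform an MPI_Reduce followed by an MPI_Scatter on these message counts. This way each processor knows how many messages it should receive (see code).
Can anyone comment on the performance of this approach? Could it seriously hinder performance?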
Answer 0 (score: 2)
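From the receiver's point of view, how can you make sure that every message sent with MPI_Isend has been received? MPI does not provide that functionality; you can only know that all of your own MPI_Isend operations have completed.
To restate your problem: the receiver fundamentally does not know how many messages the senders will send, but each sender knows when it has no more messages to send. So could you use a message to notify the receiver that rank n will not send any more messages?
Your code is also stepping over another problem: how do you ensure that all MPI_Isend operations have completed?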
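Below is code based on your example. I did not use MPI_Iprobe, because there is no computation between your MPI_Iprobe call and the if statement; I used MPI_Probe instead.
The following code ensures that all messages have been sent, and each process stops receiving once it has received a stopTAG message from every other process.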
#include "mpi.h"
#include <stdbool.h>
#include "stdio.h"
#include "stdlib.h"
int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    //ASSUMPTION: using 4 procs
    //Don't worry about this part.
    //just some stupid way to determine the receivers.
    // Irrelevant to the question.
    int recvs[3];
    int i = 0, nei = 0;
    for (; nei < 4; ++nei)
    {
        if (nei != rank)
        {
            recvs[i] = nei;
            ++i;
        }
    }
    //Proc sending msgs to its neighbors.
    //In this case, it's all other procs. (but in my real app, it's almost never the case)
    int TAG = 0;
    int stopTAG = 1;
    int buff[4] = {555, 666, 777, 888};
    MPI_Request req[3*5]; //per neighbor: 4 data sends + 1 stop send
    for (nei = 0; nei < 3; ++nei)
    {
        for (i = 0; i < 4; ++i)
        {
            MPI_Isend(&buff[i], 1, MPI_INT, recvs[nei], TAG,
                      MPI_COMM_WORLD, &req[nei * 5 + i]);
        }
    }
    //Tell each neighbor that no more data messages will follow
    for (nei = 0; nei < 3; ++nei) {
        MPI_Isend(NULL, 0, MPI_CHAR, recvs[nei], stopTAG, MPI_COMM_WORLD,
                  &req[nei * 5 + 4]);
    }
    //Receiving
    int recvCount = 0;
    MPI_Status status;
    int num;
    //stopArray[r] becomes 1 once rank r has sent its stop message
    char stopArray[size];
    for (i = 0; i < size; i++) {
        stopArray[i] = 0;
    }
    stopArray[rank] = 1; //no stop message is expected from ourselves
    char stop;
    do
    {
        //Block until a data or stop message is available
        MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
        if (status.MPI_TAG == TAG)
        {
            MPI_Recv(&num, 1, MPI_INT, status.MPI_SOURCE, TAG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            ++recvCount;
            printf("Rank %d got %d from %d\n", rank, num,
                   status.MPI_SOURCE);
        }
        else if (status.MPI_TAG == stopTAG) {
            MPI_Recv(NULL, 0, MPI_CHAR, status.MPI_SOURCE, stopTAG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            stopArray[status.MPI_SOURCE] = 1;
        }
        //Done once every other rank has sent its stop message
        stop = 1;
        for (i = 0; i < size; i++) {
            stop &= stopArray[i];
        }
    }
    while (!stop);
    //Complete all outstanding sends (data and stop messages).
    //This replaces a per-iteration MPI_Waitany whose loop guard
    //(completedSends <= 15) was always true.
    MPI_Waitall(3*5, req, MPI_STATUSES_IGNORE);
    //Total number msgs received by all procs.
    //Now here's where I see the problem!!!
    int global_count;
    MPI_Allreduce(&recvCount, &global_count, 1, MPI_INT, MPI_SUM,
                  MPI_COMM_WORLD);
    printf("\nRank %d receives %d msgs;\nGlobal count = %d\n", rank,
           recvCount, global_count);
    MPI_Finalize();
    return 0;
}