I am using the Open MPI library to implement the following algorithm: we have two processes, p1 and p2. Both of them execute some iterations, and at the end of each iteration they communicate their results. The problem is that the execution is not necessarily balanced, so p1 may execute 10 iterations in the time p2 executes 1. Even so, I want p2 to read the most recent result from the last iteration p1 executed.

So my idea is that p1 sends its result on every iteration. But before sending the result of iteration i, it should check whether p2 has actually read the information from iteration i-1. If not, it should cancel the previous send, so that when p2 reads from p1 it reads the most recent result.

Unfortunately, I don't know how to do this. I have tried using MPI_Cancel, as in the following code:
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int myrank, numprocs;
    MPI_Status status;
    MPI_Request request;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    if (myrank == 0) {
        int send_buf = 1, flag;
        MPI_Isend(&send_buf, 1, MPI_INT, 1, 123, MPI_COMM_WORLD, &request);
        /* try to cancel the send before the receiver matches it */
        MPI_Cancel(&request);
        MPI_Wait(&request, &status);
        MPI_Test_cancelled(&status, &flag);
        if (flag) printf("Send cancelled\n");
        else      printf("Send NOT cancelled\n");
        send_buf = 2;
        MPI_Isend(&send_buf, 1, MPI_INT, 1, 123, MPI_COMM_WORLD, &request);
    }
    else {
        sleep(5);
        int msg;
        MPI_Recv(&msg, 1, MPI_INT, 0, 123, MPI_COMM_WORLD, &status);
        printf("%d\n", msg);
    }
    MPI_Finalize();
    return 0;
}
But when I execute it, it says the send cannot be cancelled, and p2 prints 1 instead of 2.

I would like to know whether there is any way to achieve what I am proposing, or an alternative way to code the behaviour between p1 and p2.
Answer 0 (score: 5)
I would turn the control of the communication around: rather than p1 sending unnecessary messages that it then has to cancel, p2 should signal that it is ready to receive a message, and p1 should send only then. In the meantime, p1 simply overwrites its send buffer with the latest result.

In (untested) code:
if (rank == 0)
{
    int ready, flag;
    MPI_Request p2_request;
    MPI_Status p2_status;

    // initial request: p2 will send a token when it wants data
    MPI_Irecv(&ready, 1, MPI_INT, 1, 123, MPI_COMM_WORLD, &p2_request);

    for (int i = 0; ; i++)
    {
        sleep(1); // stands in for one iteration of computation
        MPI_Test(&p2_request, &flag, &p2_status);
        if (flag)
        {
            // blocking send: p2 is ready to receive
            MPI_Send(&i, 1, MPI_INT, 1, 123, MPI_COMM_WORLD);
            // post a new request for the next "ready" token
            MPI_Irecv(&ready, 1, MPI_INT, 1, 123, MPI_COMM_WORLD, &p2_request);
        }
    }
}
else
{
    int msg;
    MPI_Status status;
    while (1)
    {
        sleep(5);
        // actual message content doesn't matter, just let p1 know we're ready
        MPI_Send(&msg, 1, MPI_INT, 0, 123, MPI_COMM_WORLD);
        // receive the freshest result
        MPI_Recv(&msg, 1, MPI_INT, 0, 123, MPI_COMM_WORLD, &status);
    }
}
Like I said, this is untested code, but you can probably see what I'm getting at. MPI_Cancel should only be used when things go seriously wrong: no message should be cancelled during normal execution.
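For completeness: the one situation where MPI_Cancel is routinely legitimate is tearing down a receive that will never be matched. Cancelling a send, as the question attempts, is allowed to fail precisely because the message may already be on its way; small messages in particular are usually delivered eagerly, which is why the code above prints "Send NOT cancelled". A minimal sketch of the shutdown case (all names are illustrative):

int buf, was_cancelled;
MPI_Request req;
MPI_Status status;

MPI_Irecv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 123, MPI_COMM_WORLD, &req);
/* ... we decide to shut down and the message never came ... */
MPI_Cancel(&req);
MPI_Wait(&req, &status);                 /* completes the cancelled request */
MPI_Test_cancelled(&status, &was_cancelled);
if (!was_cancelled) {
    /* the receive was matched after all; buf holds real data */
}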
Answer 1 (score: 5)
An altogether different approach is to use MPI one-sided communication (e.g., http://www.linux-mag.com/id/1793). Note, though, that doing the passive communication, which is what you really want here, is fairly tricky (even if the pairwise mpi_win_post and mpi_win_start are easier), and that the one-sided stuff is all hopefully going to change in MPI-3, so I don't know how far down that road I'd advise you to go.
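To make that idea concrete, here is a rough, untested sketch of what the passive-target variant could look like with the MPI-2 calls: p1 keeps overwriting a single value exposed in an RMA window, and p2 locks the window and reads the freshest copy whenever it is ready. compute_result, slow_compute and use_result are hypothetical helpers, and rank/niters are assumed to be set up as in the other listings; for portability the window memory should really come from MPI_Alloc_mem when lock/unlock synchronization is used.

double latest = 0.0;   /* p1's most recent result, exposed in the window */
MPI_Win win;

/* collective: rank 0 exposes one double, rank 1 exposes nothing */
MPI_Win_create(rank == 0 ? (void *)&latest : NULL,
               rank == 0 ? (MPI_Aint)sizeof(double) : 0,
               sizeof(double), MPI_INFO_NULL, MPI_COMM_WORLD, &win);

if (rank == 0) {
    for (int i = 0; i < niters; i++) {
        double newval = compute_result(i);       /* hypothetical */
        /* lock our own window exclusively before the local store */
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, win);
        latest = newval;
        MPI_Win_unlock(0, win);
    }
} else {
    for (int i = 0; i < niters; i++) {
        double snapshot;
        slow_compute();                          /* hypothetical */
        /* passive-target read: p1 takes no part in this epoch */
        MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
        MPI_Get(&snapshot, 1, MPI_DOUBLE, 0, 0, 1, MPI_DOUBLE, win);
        MPI_Win_unlock(0, win);                  /* the get completes here */
        use_result(snapshot);                    /* hypothetical */
    }
}
MPI_Win_free(&win);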
More directly relevant to your first attempt here: rather than cancelling messages (which, as suggested above, is pretty drastic), it is probably much easier just to go through all the queued messages (MPI guarantees that messages will not overtake each other - the only caveat being if you are using MPI_THREAD_MULTIPLE and have multiple threads sending within one MPI task, in which case "order" is only loosely defined):
#include <stdio.h>
#include <mpi.h>
#include <stdlib.h>
#include <unistd.h>
#include <math.h>

void compute() {
    const int maxusecs = 500;
    unsigned long sleepytime = (unsigned long)round(((float)rand()/RAND_MAX)*maxusecs);
    usleep(sleepytime);
}

int main(int argc, char** argv)
{
    int rank, size, i;
    int otherrank;
    const int niters = 10;
    const int tag = 5;
    double newval;
    double sentvals[niters+1];
    double othernewval;
    MPI_Request reqs[niters+1];
    MPI_Status stat;
    int ready;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != 2) {
        fprintf(stderr, "This assumes 2 processes\n");
        MPI_Finalize();
        exit(-1);
    }

    otherrank = (rank == 0 ? 1 : 0);
    srand(rank);

    compute();
    newval = rank * 100. + 0;
    sentvals[0] = newval;
    MPI_Isend(&(sentvals[0]), 1, MPI_DOUBLE, otherrank, tag, MPI_COMM_WORLD, &(reqs[0]));
    MPI_Recv (&othernewval, 1, MPI_DOUBLE, otherrank, tag, MPI_COMM_WORLD, &stat);

    for (i = 0; i < niters; i++) {
        /* drain everything that has queued up; the last message wins */
        MPI_Iprobe(otherrank, tag, MPI_COMM_WORLD, &ready, &stat);
        while (ready) {
            MPI_Recv(&othernewval, 1, MPI_DOUBLE, otherrank, tag, MPI_COMM_WORLD, &stat);
            printf("%s[%d]: Reading queued data %lf:\n",
                   (rank == 0 ? "" : "\t\t\t\t"), rank, othernewval);
            MPI_Iprobe(otherrank, tag, MPI_COMM_WORLD, &ready, &stat);
        }
        printf("%s[%d]: Got data %lf, computing:\n",
               (rank == 0 ? "" : "\t\t\t\t"), rank, othernewval);
        compute();

        /* update my data */
        newval = rank * 100. + i + 1;
        printf("%s[%d]: computed %lf, sending:\n",
               (rank == 0 ? "" : "\t\t\t\t"), rank, newval);
        sentvals[i+1] = newval;
        /* each send keeps its own buffer and request slot alive */
        MPI_Isend(&(sentvals[i+1]), 1, MPI_DOUBLE, otherrank, tag, MPI_COMM_WORLD, &(reqs[i+1]));
    }
    MPI_Finalize();
    return 0;
}
Running this gives you (note that just because data was sent doesn't mean it has been received at the time of printing):
[0]: Got data 100.000000, computing:
[1]: Got data 0.000000, computing:
[0]: computed 1.000000, sending:
[0]: Got data 100.000000, computing:
[1]: computed 101.000000, sending:
[1]: Got data 0.000000, computing:
[0]: computed 2.000000, sending:
[0]: Got data 100.000000, computing:
[1]: computed 102.000000, sending:
[1]: Reading queued data 1.000000:
[1]: Got data 1.000000, computing:
[0]: computed 3.000000, sending:
[0]: Reading queued data 101.000000:
[0]: Got data 101.000000, computing:
[1]: computed 103.000000, sending:
[1]: Reading queued data 2.000000:
[1]: Got data 2.000000, computing:
[0]: computed 4.000000, sending:
[1]: computed 104.000000, sending:
[0]: Reading queued data 102.000000:
[1]: Reading queued data 3.000000:
[1]: Got data 3.000000, computing:
[0]: Got data 102.000000, computing:
[0]: computed 5.000000, sending:
[0]: Reading queued data 103.000000:
[0]: Got data 103.000000, computing:
[1]: computed 105.000000, sending:
[1]: Reading queued data 4.000000:
[1]: Got data 4.000000, computing:
[0]: computed 6.000000, sending:
[0]: Reading queued data 104.000000:
[0]: Got data 104.000000, computing:
[1]: computed 106.000000, sending:
[1]: Reading queued data 5.000000:
[1]: Got data 5.000000, computing:
[0]: computed 7.000000, sending:
[0]: Reading queued data 105.000000:
[0]: Got data 105.000000, computing:
[1]: computed 107.000000, sending:
[1]: Reading queued data 6.000000:
[1]: Got data 6.000000, computing:
[0]: computed 8.000000, sending:
[0]: Reading queued data 106.000000:
[0]: Got data 106.000000, computing:
[1]: computed 108.000000, sending:
[1]: Reading queued data 7.000000:
[1]: Got data 7.000000, computing:
[0]: computed 9.000000, sending:
[0]: Reading queued data 107.000000:
[0]: Got data 107.000000, computing:
[1]: computed 109.000000, sending:
[1]: Reading queued data 8.000000:
[1]: Got data 8.000000, computing:
[0]: computed 10.000000, sending:
[1]: computed 110.000000, sending:
Note that this is just demonstration code; a final version really needs waitalls and more iprobes at the end, to free any pending requests and flush any waiting messages.
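In case it's useful, a rough sketch of what that cleanup could look like just before the MPI_Finalize, reusing the otherrank/tag/ready/othernewval/stat/reqs names from the listing above. For the small, eagerly delivered messages used here this is enough; a fully robust shutdown would also want a final handshake or barrier:

/* flush any results still queued up from the other rank */
MPI_Iprobe(otherrank, tag, MPI_COMM_WORLD, &ready, &stat);
while (ready) {
    MPI_Recv(&othernewval, 1, MPI_DOUBLE, otherrank, tag, MPI_COMM_WORLD, &stat);
    MPI_Iprobe(otherrank, tag, MPI_COMM_WORLD, &ready, &stat);
}
/* then complete all of our own outstanding sends */
MPI_Waitall(niters+1, reqs, MPI_STATUSES_IGNORE);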
Answer 2 (score: 0)
Do your environment and MPI distribution support multi-threading? If so, you could create a thread in P1 that computes the value and stores each iteration's result in a variable shared with P1's main thread (writes protected by a semaphore). Then, as suszterpatt suggested above, have P2 send an "I'm ready" message to P1 and have P1 respond with the value from the most recent iteration.
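A rough, untested sketch of that layout, assuming pthreads with a mutex standing in for the semaphore, and MPI_THREAD_FUNNELED (only the main thread makes MPI calls); the loops run forever for brevity, so real code would need a termination condition:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

/* Shared state in P1: the worker thread overwrites latest_result,
 * the main thread reads it under the same mutex. */
static double latest_result = 0.0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *compute_loop(void *arg) {
    (void)arg;
    for (int i = 0; ; i++) {
        /* stand-in for one real iteration */
        pthread_mutex_lock(&lock);
        latest_result = (double)i;   /* overwrite: only the newest survives */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(int argc, char *argv[]) {
    int provided, rank;
    /* FUNNELED suffices here; real code should check 'provided' */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                               /* P1 */
        pthread_t worker;
        pthread_create(&worker, NULL, compute_loop, NULL);
        while (1) {
            int ping;
            MPI_Status st;
            /* wait for P2's "I'm ready" message */
            MPI_Recv(&ping, 1, MPI_INT, 1, 123, MPI_COMM_WORLD, &st);
            pthread_mutex_lock(&lock);
            double snapshot = latest_result;
            pthread_mutex_unlock(&lock);
            MPI_Send(&snapshot, 1, MPI_DOUBLE, 1, 123, MPI_COMM_WORLD);
        }
    } else {                                       /* P2 */
        while (1) {
            int ping = 0;
            double value;
            MPI_Status st;
            MPI_Send(&ping, 1, MPI_INT, 0, 123, MPI_COMM_WORLD);
            MPI_Recv(&value, 1, MPI_DOUBLE, 0, 123, MPI_COMM_WORLD, &st);
            printf("P2 got %f\n", value);
            sleep(5);                              /* P2 is the slow one */
        }
    }
    MPI_Finalize();
    return 0;
}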