MPI_Waitsome fails

Date: 2016-12-06 16:46:18

Tags: c mpi nonblocking

I am trying to use the MPI_Waitsome function with a different index array (and outcount) depending on the processor's rank. All 16 processors return the same error:

[0] fatal error
Fatal error in MPI_Waitsome: Invalid MPI_Request, error stack:
MPI_Waitsome(count=13, req_array="some address", out_count="some address", indices="some address", status_array="some address") failed
Invalid MPI_Request

The offending code is shown below:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int row(int inrank);
int column(int inrank);

int main()
{
    int numtasks, rank, len, rc, i, n, tag = 1, outbufp, inbuf1[3], inbuf2[4];

    MPI_Status stats[13];
    MPI_Request reqs[13];

    for (i = 0; i<3; i++) {
        inbuf1[i] = MPI_PROC_NULL;
    }

    for (i = 0; i<4; i++) {
        inbuf2[i] = MPI_PROC_NULL;
    }

    char hostname[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(NULL,NULL);

    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Get_processor_name(hostname, &len);

    printf("Number of tasks= %d My rank= %d Running on %s\n",numtasks,rank,hostname);

    outbufp = rank;

    n = 0;

    for(i = (row(rank) * 4); i < ((row(rank) + 1) * 4); i++){       
        if(i != rank){          
            MPI_Isend(&outbufp, 1, MPI_INT, i, tag, MPI_COMM_WORLD, &reqs[n]);
            n++;
        }       
    }

    for(i = row(rank); i < 16; i += 4){
        if(row(i) != row(rank)){        
            MPI_Isend(&outbufp, 1, MPI_INT, i, tag, MPI_COMM_WORLD, &reqs[n]);
            n++;
        }       
    }

    for(i = (row(rank) * 4); i < ((row(rank) + 1) * 4); i++){   
        if(i != rank){          
            MPI_Irecv(&inbuf1, 1, MPI_INT, i, tag, MPI_COMM_WORLD, &reqs[n]);
            n++;
        }       
    }
    if(row(rank != column(rank))){
        for(i = (column(rank)*4); i < ((column(rank) + 1)*4); i++){         
                MPI_Irecv(&inbuf2, 1, MPI_INT, i, tag, MPI_COMM_WORLD, &reqs[n]);
                n++;        
        }
    }

    int *indic;
    indic = (int*) malloc(100*sizeof(int));

    for(i = 0; i < n; i++){
        indic[i] = i;
    }
    MPI_Waitsome(13, reqs, n, indic, stats);

    printf("inbuf(y) =\t");

    for(i=0;i<3;i++){
        printf("%d\t",inbuf1[i]);
    }
    if(row(rank) != column(rank)){
        for(i=0;i<4;i++){
            printf("%d\t",inbuf2[i]);
        }
    }
    MPI_Finalize();

    return 0;
}

int row(int inrank){
    return (inrank - inrank%4)/4;
}
int column(int inrank){
    return (inrank%4);
}

My goal is to transfer data between processors in a Cartesian grid. I am aware of MPI's Cartesian virtual topologies, but I want to hard-code a version first. I am considering using two MPI_Waitall calls inside an if statement, but I do not understand why the MPI_Waitsome call is currently failing. I suspect it may be related to the index array being dynamic, but it has to be, since its size differs depending on whether the processor lies on the diagonal or not.
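For reference, the built-in approach mentioned above might look roughly like this minimal sketch of a 4x4 Cartesian virtual topology (the dims/periods values and variable names here are assumptions for illustration, not part of the original code):

int dims[2] = {4, 4};       /* 4x4 process grid, matching the hard-coded layout */
int periods[2] = {0, 0};    /* non-periodic in both dimensions */
int coords[2], cart_rank;
MPI_Comm cart_comm;

MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &cart_comm);
MPI_Comm_rank(cart_comm, &cart_rank);
MPI_Cart_coords(cart_comm, cart_rank, 2, coords);   /* coords[0] = row, coords[1] = column */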

Edit: solved with two MPI_Waitall calls, as shown below

if(row(rank) != column(rank)){
    for(i = (column(rank)*4); i < ((column(rank) + 1)*4); i++){         
            MPI_Irecv(&inbuf2[m], 1, MPI_INT, i, rank, MPI_COMM_WORLD, &reqs2[m]);
            m++;        
    }
}

//printf("Proc %d\n", rank);

MPI_Waitall(9, reqs, stats);

if(row(rank)!=column(rank)){
    MPI_Waitall(4, reqs2, stats2);
}
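(For the snippet above to compile, declarations along these lines are assumed; they are not shown in the edit itself:)

int m = 0;                  /* index into inbuf2 and reqs2 */
MPI_Request reqs2[4];       /* requests for the column receives */
MPI_Status stats2[4];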

1 Answer:

Answer 0 (score: 1)

First things first: listen to your compiler warnings. It will tell you something like this:

mpiwaitsome.c: In function ‘main’:
mpiwaitsome.c:83:28: warning: passing argument 3 of ‘MPI_Waitsome’ makes pointer from integer without a cast [-Wint-conversion]
     MPI_Waitsome(13, reqs, n, indic, stats);
                            ^
In file included from mpiwaitsome.c:1:0:
/usr/include/mpi.h:1817:20: note: expected ‘int *’ but argument is of type ‘int’
 OMPI_DECLSPEC  int MPI_Waitsome(int incount, MPI_Request array_of_requests[],
                    ^~~~~~~~~~~~

This shows how important it is to always fix warnings. The compiler is telling you quite clearly that you are using the API incorrectly. You can then check the documentation:

outcount
    number of completed requests (integer) 
array_of_indices
    array of indices of operations that completed (array of integers) 
array_of_statuses
    array of status objects for operations that completed (array of Status).

These are all output parameters. You use it like so:

int *indic = malloc(n*sizeof(int));
int outcount;
MPI_Waitsome(n, reqs, &outcount, indic, stats);

I am not sure what you intended to tell MPI by pre-filling the indices like that. You need the indices to find out which operations actually completed: some means that not all of them may have completed. You must not read a receive buffer unless you know the corresponding operation has completed. If you want to read all receive buffers and reuse the send buffer, use MPI_Waitall.
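As a minimal sketch of how the completed indices might then be inspected (variable names follow the corrected call above; the loop body is only illustrative):

int outcount;
MPI_Waitsome(n, reqs, &outcount, indic, stats);

/* indic[0..outcount-1] hold positions in reqs of the requests that completed;
   only buffers belonging to completed receives may be read safely. */
for (i = 0; i < outcount; i++) {
    printf("request %d completed\n", indic[i]);
}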

Apropos buffers: you pass the same pointer &inbuf1 / &inbuf2 on every iteration of those loops, so multiple concurrent receive operations write to the same location, which is wrong. Also be very careful: if inbuf were not an int[] but an int*, then &inbuf would be the address of the pointer itself... You probably want something like:

MPI_Irecv(&inbuf1[magical-formula-for-index], 1, ...
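A minimal sketch of what the receive loops might look like with separate buffer slots (the counters j and k are illustrative names that do not appear in the original code):

int j = 0, k = 0;

/* Row receives: one slot of inbuf1 per sender. */
for (i = row(rank) * 4; i < (row(rank) + 1) * 4; i++) {
    if (i != rank) {
        MPI_Irecv(&inbuf1[j], 1, MPI_INT, i, tag, MPI_COMM_WORLD, &reqs[n]);
        j++;
        n++;
    }
}

/* Column receives: one slot of inbuf2 per sender. */
if (row(rank) != column(rank)) {
    for (i = column(rank) * 4; i < (column(rank) + 1) * 4; i++) {
        MPI_Irecv(&inbuf2[k], 1, MPI_INT, i, tag, MPI_COMM_WORLD, &reqs[n]);
        k++;
        n++;
    }
}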