Fatal error in MPI_Gatherv

Date: 2018-01-11 04:59:49

Tags: c++ parallel-processing mpi

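I am new to MPI and am trying to use MPI_Gatherv. I have two problems: the function does not gather all the items from all the processors, and sometimes it gives me a fatal error. I don't understand what is happening!

Each processor has a vector containing the indices of the minimum number, and the size of that vector may differ from one processor to the next.

Which part is wrong? Any help would be appreciated.

My code: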

MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
MPI_Comm_size(MPI_COMM_WORLD, &comm_size);
vector<int> localMin;
for (int i=0; i<numPerProc; i++)
{
    if (receive_buffer[i]==min) {
        int adjIndex=numPerProc*my_rank+i;
        localMin.push_back(adjIndex);
    }
}
// I thought it might be better to use an array instead of a vector:
int nelements=localMin.size();
int* localMinArray=new int[nelements];
for (int i=0; i<nelements; i++) {
    localMinArray[i]=localMin[i];
}

int *counts = new int[comm_size];

// Each process tells the root how many elements it holds
MPI_Gather(&nelements, 1, MPI_INT, counts, 1, MPI_INT, 0, MPI_COMM_WORLD);

// Displacements in the receive buffer for MPI_GATHERV
int *disps = new int[comm_size];
// Displacement for the first chunk of data - 0
for (int i = 0; i < comm_size; i++)
    disps[i] = (i > 0) ? (disps[i-1] + counts[i-1]) : 0;

int *allMin;
if (my_rank == 0)
    // disps[comm_size-1]+counts[comm_size-1] == total number of elements
    allMin = new int[disps[comm_size-1]+counts[comm_size-1]];


// Collect everything into the root
MPI_Gatherv(&localMinArray, nelements, MPI_INT, &allMin, counts, disps, MPI_INT, 0, MPI_COMM_WORLD);

The result is as follows:

Vector values in Rank  0: 4,11,
Vector values in Rank  1: 
Vector values in Rank  2: 24,31,
count values:
2,0,2,
disps values:
0,2,2,
allMin values:
4,11,0,-268435456,
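
For comparison, the counts and displacements printed above are consistent with each other: each displacement is the running total of the counts before it. The sketch below reproduces that arithmetic and the buffer layout a gather with those values should produce at the root. It uses no MPI at all; `make_displs` and `simulate_gatherv` are hypothetical helper names introduced only for this illustration, and the per-rank data is taken from the output above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exclusive prefix sum: displs[i] is the offset in the root buffer where
// rank i's chunk starts (hypothetical helper, not part of MPI).
std::vector<int> make_displs(const std::vector<int>& counts) {
    std::vector<int> displs(counts.size(), 0);
    for (std::size_t i = 1; i < counts.size(); ++i)
        displs[i] = displs[i - 1] + counts[i - 1];
    return displs;
}

// Plain C++ stand-in for the root side of a variable-count gather:
// copies each rank's chunk to its displacement in one flat buffer.
std::vector<int> simulate_gatherv(const std::vector<std::vector<int>>& per_rank) {
    std::vector<int> counts;
    for (const auto& chunk : per_rank)
        counts.push_back(static_cast<int>(chunk.size()));

    const std::vector<int> displs = make_displs(counts);
    const int total = counts.empty() ? 0 : displs.back() + counts.back();

    std::vector<int> all(total);
    for (std::size_t r = 0; r < per_rank.size(); ++r)
        for (int i = 0; i < counts[r]; ++i)
            all[displs[r] + i] = per_rank[r][i];
    return all;
}
```

With the three rank vectors shown above ({4, 11}, {}, {24, 31}), `make_displs` gives 0, 2, 2 and `simulate_gatherv` gives 4, 11, 24, 31. The counts and displacements in my output are therefore fine; only the last two entries of allMin are wrong.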

These are the errors I sometimes get:

*** An error occurred in MPI_Gatherv
*** reported by process [3485859841,1]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_COUNT: invalid count argument
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)

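1 Answer:

Answer 0 (score: 0)

I fixed this by making the following change, which passes the buffer pointers themselves rather than the addresses of the pointer variables: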

MPI_Gatherv(localMinArray, nelements, MPI_INT, allMin, counts, disps, MPI_INT, 0, MPI_COMM_WORLD);
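
Why this helps, as far as I can tell: in the question, `&localMinArray` and `&allMin` are the addresses of the pointer variables (type `int**`), not of the arrays they point to. Because the buffer parameters are untyped, the compiler accepts either, but the library then reads and writes at the wrong location. Writing `*&p` simply yields `p` again, so the plain pointer is what needs to be passed. The sketch below shows the difference without using MPI; `write_ints` and `pointer_demo` are hypothetical names made up for this illustration, with `write_ints` standing in for any C-style call that takes a `void*` buffer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a C-style API that takes an untyped buffer
// pointer and writes n ints into it.
void write_ints(void* buf, int n) {
    int* out = static_cast<int*>(buf);
    for (int i = 0; i < n; ++i) out[i] = 10 * i;
}

// Returns true if every check below holds.
bool pointer_demo() {
    int* allMin = new int[4];

    const void* buffer    = allMin;    // address of the heap array (what the callee needs)
    const void* variable  = &allMin;   // address of the pointer variable itself
    const void* roundTrip = *&allMin;  // dereferencing &allMin gives allMin back

    bool ok = (roundTrip == buffer) && (variable != buffer);

    // Passing the pointer itself fills the array as intended.
    write_ints(allMin, 4);
    ok = ok && allMin[0] == 0 && allMin[3] == 30;
    delete[] allMin;

    // A std::vector can be passed directly through data(), no manual copy.
    std::vector<int> localMin(4);
    write_ints(localMin.data(), 4);
    return ok && localMin[3] == 30;
}
```

The last part suggests a possible simplification for the question's code: since `std::vector<int>` stores its elements contiguously, `localMin.data()` could presumably be used as the send buffer in place of the hand-copied `localMinArray`.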