MPI_Gatherv: Fatal error in MPI_Gatherv: Message truncated, error stack

Date: 2011-11-04 13:19:38

Tags: c, mpi

Edit #1:

So, the solution: the line

MPI_Gatherv(buffer, rank, MPI_INT, buffer, receive_counts, receive_displacements, MPI_INT, 0, MPI_COMM_WORLD);

must be changed to

MPI_Gatherv(buffer, receive_counts[rank], MPI_INT, buffer, receive_counts, receive_displacements, MPI_INT, 0, MPI_COMM_WORLD);
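
In other words, the count each rank passes on the send side has to equal the entry the root holds for that rank in receive_counts. A minimal annotated sketch of the corrected call (same variables as the program below):

/* before: rank 3 always sent `rank` = 3 ints, while the root had only
   reserved receive_counts[3] = 1 slot for it -> "Message truncated"   */
/* after:  every rank sends receive_counts[rank] ints, which is exactly
   what the root's receive_counts/receive_displacements lay out        */
MPI_Gatherv(buffer, receive_counts[rank], MPI_INT,
            buffer, receive_counts, receive_displacements, MPI_INT,
            0, MPI_COMM_WORLD);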

Thanks again for your help.


Original post:

My code is from DeinoMPI.

When I run mpiexec -localonly 4 skusamGatherv.exe, everything works fine.

If I change the line

int receive_counts[4] = { 0, 1, 2, 3 };

to

int receive_counts[4] = { 0, 1, 2, 1 };

it still compiles, but when I run mpiexec -localonly 4 skusamGatherv.exe I get the error below.

I thought it would work.

Thanks for your help.


This is the error I get:

Fatal error in MPI_Gatherv: Message truncated, error stack:
MPI_Gatherv(363)........................: MPI_Gatherv failed(sbuf=0012FF4C, scou
nt=0, MPI_INT, rbuf=0012FF2C, rcnts=0012FEF0, displs=0012FED8, MPI_INT, root=0,
MPI_COMM_WORLD) failed
MPIDI_CH3_PktHandler_EagerShortSend(351): Message from rank 3 and tag 4 truncate
d; 12 bytes received but buffer size is 4
unable to read the cmd header on the pmi context, Error = -1
.
0. [0][0][0][0][0][0] , [0][0][0][0][0][0]
Error posting readv, An existing connection was forcibly closed by the remote ho
st.(10054)
unable to read the cmd header on the pmi context, Error = -1
.
Error posting readv, An existing connection was forcibly closed by the remote ho
st.(10054)
1. [1][1][1][1][1][1] , [0][0][0][0][0][0]
unable to read the cmd header on the pmi context, Error = -1
.
Error posting readv, An existing connection was forcibly closed by the remote ho
st.(10054)
2. [2][2][2][2][2][2] , [0][0][0][0][0][0]
unable to read the cmd header on the pmi context, Error = -1
.
Error posting readv, An existing connection was forcibly closed by the remote ho
st.(10054)
3. [3][3][3][3][3][3] , [0][0][0][0][0][0]

job aborted:
rank: node: exit code[: error message]
0: jan-pc-nb: 1: Fatal error in MPI_Gatherv: Message truncated, error stack:
MPI_Gatherv(363)........................: MPI_Gatherv failed(sbuf=0012FF4C, scou
nt=0, MPI_INT, rbuf=0012FF2C, rcnts=0012FEF0, displs=0012FED8, MPI_INT, root=0,
MPI_COMM_WORLD) failed
MPIDI_CH3_PktHandler_EagerShortSend(351): Message from rank 3 and tag 4 truncate
d; 12 bytes received but buffer size is 4
1: jan-pc-nb: 1
2: jan-pc-nb: 1
3: jan-pc-nb: 1
Press any key to continue . . .

My code:

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int buffer[6];
    int rank, size, i;
    int receive_counts[4] = { 0, 1, 2, 3 };
    int receive_displacements[4] = { 0, 0, 1, 3 };

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (size != 4)
    {
        if (rank == 0)
        {
            printf("Please run with 4 processes\n");fflush(stdout);
        }
        MPI_Finalize();
        return 0;
    }
    for (i=0; i<rank; i++)
    {
        buffer[i] = rank;
    }
    MPI_Gatherv(buffer, rank, MPI_INT, buffer, receive_counts, receive_displacements, MPI_INT, 0, MPI_COMM_WORLD);
    if (rank == 0)
    {
        for (i=0; i<6; i++)
        {
            printf("[%d]", buffer[i]);
        }
        printf("\n");
        fflush(stdout);
    }
    MPI_Finalize();
    return 0;
}

1 Answer:

Answer (score: 1):

Take a step back and think about what MPI_Gatherv does: it is an MPI_Gather (to rank 0 in this case) in which each process may send a different amount of data.

In your example, rank 0 sends 0 integers, rank 1 sends 1 integer, rank 2 sends 2 integers, and rank 3 sends 3 integers.
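
With receive_counts = {0, 1, 2, 3} and receive_displacements = {0, 0, 1, 3} (the working case), the six slots of the root's receive buffer line up as sketched below, so the successful run should print [1][2][2][3][3][3] (a sketch derived from the arrays in the code above):

/* root buffer index:  0  1  2  3  4  5
   written by rank:    1  2  2  3  3  3
   rank 0: 0 ints; rank 1: 1 int at displ 0;
   rank 2: 2 ints at displ 1; rank 3: 3 ints at displ 3 */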

MPIDI_CH3_PktHandler_EagerShortSend(351): Message from rank 3 and tag 4 truncated; 12 bytes received but buffer size is 4

It is buried in a lot of other output, but this line says that rank 3 sent 3 integers (12 bytes), while rank 0 only had room for 1 int from it.

Look at the first three arguments of your gatherv call: buffer, rank, MPI_INT. Whatever you set up on the receive side, rank 3 will always send 3 integers.

Note that you could over-allocate on the receive side (you could even make the last entry of receive_counts 100), but with the smaller receive_counts[3] you told the MPI library to expect only 1 int, even though you sent 3.
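
For completeness, here is a minimal sketch of a send side that stays consistent with the modified receive_counts = {0, 1, 2, 1}. It simply applies the fix from the edit above and, purely for illustration, fills only the elements that are actually sent; send_count is a hypothetical helper variable, everything else comes from the program above:

int send_count = receive_counts[rank];    /* 0, 1, 2 or 1 elements */
for (i = 0; i < send_count; i++)
{
    buffer[i] = rank;                     /* fill only what will be gathered */
}
MPI_Gatherv(buffer, send_count, MPI_INT,
            buffer, receive_counts, receive_displacements, MPI_INT,
            0, MPI_COMM_WORLD);

With these counts the root receives only 0 + 1 + 2 + 1 = 4 integers, so the last two slots of its 6-element buffer are left untouched.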