MPI_Isend segmentation fault

Time: 2012-11-24 17:34:27

Tags: c mpi

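I'm having a problem with an MPI non-blocking send that crashes a process with a segmentation fault. All processes receive their data correctly, but the process with id 0 crashes during the MPI_Waitall() call. Can anyone identify what is causing the problem? Thanks!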

Here is the program's source code, followed by the error report I get at run time:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define BLOCK_LOW(id,p,n) ((id)*(n)/(p))
#define BLOCK_HIGH(id,p,n) (BLOCK_LOW((id)+1,p,n)-1)
#define BLOCK_SIZE(id,p,n) (BLOCK_HIGH(id,p,n)-BLOCK_LOW(id,p,n)+1)
#define BLOCK_OWNER(id,p,n) (((p)*((id)+1)-1)/(n))

#define LENGTH 100

int main(int argc, char *argv[]) {
    int id, p, i;
    MPI_Request* sendRequests;
    MPI_Status* sendStatuses;
    MPI_Request receiveRequest;
    MPI_Status receiveStatus;

    int array[LENGTH];
    int array2[LENGTH];

    MPI_Init(&argc, &argv);
    MPI_Barrier(MPI_COMM_WORLD);

    for (i = 0; i < LENGTH; i++) {
        array[i] = i * 5;
        array2[i] = 0;
    }


    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    if (id == 0) {
        sendRequests = malloc((p-1) * sizeof(MPI_Request));

        for (i = 1; i < p; i++) {
            MPI_Isend(array + BLOCK_LOW(i-1, p-1, LENGTH), BLOCK_SIZE(i-1, p-1, LENGTH), MPI_INT, i, 0, MPI_COMM_WORLD, &sendRequests[i-1]);
        }

        MPI_Waitall(p-1, sendRequests, sendStatuses);
    } else {
        MPI_Recv(array2, BLOCK_SIZE(id-1, p-1, LENGTH), MPI_INT, 0, 0, MPI_COMM_WORLD, &receiveStatus);

        for (i = 0; i < BLOCK_SIZE(id-1, p-1, LENGTH); i++) {
            printf("Element %d (%d): %d\n", i, i + BLOCK_LOW(id-1, p-1, LENGTH), array2[i]);
        }
    }

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}

This is the error I get when I run the code:

[lin12p5:13467] *** Process received signal ***
[lin12p5:13467] Signal: Segmentation fault (11)
[lin12p5:13467] Signal code: Invalid permissions (2)
[lin12p5:13467] Failing at address: 0x400f30
[lin12p5:13467] [ 0] /lib/libpthread.so.0(+0xeff0) [0x7fa96ab4eff0]
[lin12p5:13467] [ 1] /usr/lib/libmpi.so.0(+0x37f01) [0x7fa96bad5f01]
[lin12p5:13467] [ 2] /usr/lib/libmpi.so.0(PMPI_Waitall+0xb3) [0x7fa96bb06b73]
[lin12p5:13467] [ 3] mpi-test(main+0x232) [0x400da6]
[lin12p5:13467] [ 4] /lib/libc.so.6(__libc_start_main+0xfd) [0x7fa96a7fcc8d]
[lin12p5:13467] [ 5] mpi-test() [0x400ab9]
[lin12p5:13467] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 13467 on node lab12p5 exited on signal 11     (Segmentation fault).
--------------------------------------------------------------------------

[lin13p5][[33088,1],1][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)

1 answer:

Answer 0 (score: 4)

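You have not allocated any space for sendStatuses, so MPI_Waitall() writes the statuses through an uninitialized pointer. You need to malloc() some space for it, just as you did for sendRequests. You should also free() both arrays when you are done, to prevent memory leaks.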