I wrote the following code in C using MPI:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int size, rank;
    MPI_Status status;
    int buf[1000];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        int i = 0;
        while (i != 1000) {
            buf[i] = i;
            i++;
        }
        MPI_Send(buf, 999, MPI_INT, 1, 1, MPI_COMM_WORLD);
        printf("msg has been sent\n");
    }
    if (rank == 1) {
        int sz = sizeof(buf);
        int lst = buf[sz-1];
        MPI_Recv(buf, 999, MPI_INT, 0, 1, MPI_COMM_WORLD, &status);
        printf("la taille du buf %d et dernier %d", sz, lst);
    }
    MPI_Finalize();
}
When I run it, it produces the following output:
msg has been sent
[blitzkrieg-TravelMate-P253:03395] *** Process received signal ***
[blitzkrieg-TravelMate-P253:03395] Signal: Segmentation fault (11)
[blitzkrieg-TravelMate-P253:03395] Signal code: Address not mapped (1)
[blitzkrieg-TravelMate-P253:03395] Failing at address: 0xbfee8574
[blitzkrieg-TravelMate-P253:03395] [0] [0xb772d40c]
[blitzkrieg-TravelMate-P253:03395] [1] mpii(main+0x12f) [0x8048883]
[blitzkrieg-TravelMate-P253:03395] [2] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0xb74c84d3]
[blitzkrieg-TravelMate-P253:03395] [3] mpii() [0x80486c1]
[blitzkrieg-TravelMate-P253:03395] *** End of error message ***
mpirun noticed that process rank 1 with PID 3395 on node blitzkrieg-TravelMate-P253 exited on signal 11 (Segmentation fault).
Any suggestions would help, thanks.
Answer 0 (score: 9)
The stack trace shows that the error is not in MPI_Recv, as the question title suggests. The error is actually here:
int sz = sizeof(buf);
int lst = buf[sz-1]; // <---- here
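Since buf is an array of int, sizeof(buf) returns its size in bytes, so sz is set to 4 times the number of elements in the array (int is 4 bytes on this platform). Accessing buf[sz-1] therefore reads far beyond the bounds of buf, into an unmapped memory region above the process stack.
You should divide the total size of the array by the size of one of its elements, e.g. the first one: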
int sz = sizeof(buf) / sizeof(buf[0]);
int lst = buf[sz-1];