I have been learning to use some MPI functions. But when I try MPI_Reduce, stack smashing is detected when I run the code:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int i, rank, size;
    int sendBuf, recvBuf, count;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    sendBuf = rank;
    count = size;
    MPI_Reduce(&sendBuf, &recvBuf, count, MPI_INT,
               MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("Sum is %d\n", recvBuf);
    }
    MPI_Finalize();
    return 0;
}
My code seems fine to me. It should print, in recvBuf on process 0, the sum of all the ranks. In that case, if I run my code with 10 processes, mpirun -np 10 myexecutefile, it should print Sum is 45. But I don't understand why my code produces this error:
Sum is 45
*** stack smashing detected ***: example6 terminated
[ubuntu:06538] *** Process received signal ***
[ubuntu:06538] Signal: Aborted (6)
[ubuntu:06538] Signal code: (-6)
[ubuntu:06538] *** Process received signal ***
[ubuntu:06538] Signal: Segmentation fault (11)
[ubuntu:06538] Signal code: (128)
[ubuntu:06538] Failing at address: (nil)
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node ubuntu exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
What is the problem, and how can I fix it?
Answer 0 (score: 2)
In MPI_Reduce(&sendBuf, &recvBuf, count, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD); the argument count must be the number of elements in the send buffer. Since sendBuf is a single integer, use count = 1; instead of count = size;.
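A corrected version of the program follows, with count fixed at 1; this is a sketch that assumes an MPI installation (it must be built with mpicc and launched with mpirun), so it is not verified here:

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size;
    int sendBuf, recvBuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    sendBuf = rank;
    /* count is 1: each rank contributes exactly one int,
       and recvBuf on rank 0 receives exactly one int */
    MPI_Reduce(&sendBuf, &recvBuf, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        /* with -np 10 this is 0+1+...+9 = 45 */
        printf("Sum is %d\n", recvBuf);
    }
    MPI_Finalize();
    return 0;
}
```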
Why Sum is 45 was still printed correctly is hard to explain. Accessing values out of bounds is undefined behavior: the problem might go unnoticed, or a segmentation fault might have been raised before Sum is 45 was even printed. The magic of undefined behavior...
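For comparison, count = size would only be legal if both buffers actually held size elements; MPI_Reduce would then sum the buffers element-wise across ranks. A hypothetical sketch of that variant (again assuming an MPI environment, not verified here):

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int i, rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* buffers sized to match count, so count = size is now valid */
    int *sendBuf = malloc(size * sizeof(int));
    int *recvBuf = malloc(size * sizeof(int));
    for (i = 0; i < size; i++)
        sendBuf[i] = rank;

    /* recvBuf[i] on rank 0 becomes the sum of sendBuf[i] over all ranks */
    MPI_Reduce(sendBuf, recvBuf, size, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("recvBuf[0] is %d\n", recvBuf[0]);

    free(sendBuf);
    free(recvBuf);
    MPI_Finalize();
    return 0;
}
```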