该程序使用C Lagrange和MPI编写。我是MPI的新手,想要使用所有处理器进行一些计算,包括进程0.为了学习这个概念,我编写了以下简单程序。但是这个程序在接收到进程0的输入后挂在底部,并且不会将结果发送回进程0。
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
MPI_Init(&argc, &argv);
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int number;
int result;
if (world_rank == 0)
{
number = -2;
int i;
for(i = 0; i < 4; i++)
{
MPI_Send(&number, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
}
for(i = 0; i < 4; i++)
{ /*Error: can't get result send by other processos bellow*/
MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("Process 0 received number %d from i:%d\n", number, i);
}
}
/*I want to do this without using an else statement here, so that I can use process 0 to do some calculations as well*/
MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("*Process %d received number %d from process 0\n",world_rank, number);
result = world_rank + 1;
MPI_Send(&result, 1, MPI_INT, 0, 99, MPI_COMM_WORLD); /* problem happens here when trying to send result back to process 0*/
MPI_Finalize();
}
运行并获得结果:
:$ mpicc test.c -o test
:$ mpirun -np 4 test
*Process 1 received number -2 from process 0
*Process 2 received number -2 from process 0
*Process 3 received number -2 from process 0
/* hangs here and will not continue */
如果可以,请向我展示一个示例或尽可能编辑上述代码。
答案 0 :(得分:1)
我真的不知道在工作域周围使用2 if
语句会出现什么问题。但无论如何,这是一个可以做的事情的例子。
我修改了您的代码以使用集体通信,因为它们比您使用的一系列发送/接收更有意义。由于初始通信具有统一的值,因此我使用MPI_Bcast()
在一次调用中执行相同操作
相反,由于结果值都不同,因此调用MPI_Gather()
非常合适
我还引用了对sleep()
的调用,只是为了模拟这些进程在发回结果之前有效。
现在代码如下:
#include <mpi.h>
#include <stdlib.h> // for malloc and free
#include <stdio.h> // for printf
#include <unistd.h> // for sleep
int main( int argc, char *argv[] ) {
MPI_Init( &argc, &argv );
int world_rank;
MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );
int world_size;
MPI_Comm_size( MPI_COMM_WORLD, &world_size );
// sending the same number to all processes via broadcast from process 0
int number = world_rank == 0 ? -2 : 0;
MPI_Bcast( &number, 1, MPI_INT, 0, MPI_COMM_WORLD );
printf( "Process %d received %d from process 0\n", world_rank, number );
// Do something usefull here
sleep( 1 );
int my_result = world_rank + 1;
// Now collecting individual results on process 0
int *results = world_rank == 0 ? malloc( world_size * sizeof( int ) ) : NULL;
MPI_Gather( &my_result, 1, MPI_INT, results, 1, MPI_INT, 0, MPI_COMM_WORLD );
// Process 0 prints what it collected
if ( world_rank == 0 ) {
for ( int i = 0; i < world_size; i++ ) {
printf( "Process 0 received result %d from process %d\n", results[i], i );
}
free( results );
}
MPI_Finalize();
return 0;
}
编译后如下:
$ mpicc -std=c99 simple_mpi.c -o simple_mpi
它运行并给出了这个:
$ mpiexec -n 4 ./simple_mpi
Process 0 received -2 from process 0
Process 1 received -2 from process 0
Process 3 received -2 from process 0
Process 2 received -2 from process 0
Process 0 received result 1 from process 0
Process 0 received result 2 from process 1
Process 0 received result 3 from process 2
Process 0 received result 4 from process 3
答案 1 :(得分:1)
实际上,进程1-3确实将结果发送回处理器0.但是,处理器0在此循环的第一次迭代中停留:
for(i=0; i<4; i++)
{
MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("Process 0 received number %d from i:%d\n", number, i);
}
在第一次MPI_Recv调用中,处理器0将阻止等待从其自身接收带有标记99的消息,该消息0尚未发送。
通常,处理器向自己发送/接收消息是一个坏主意,尤其是使用阻塞调用。 0已经具有内存中的值。它不需要发送给自己。
但是,解决方法是从i=1
for(i=1; i<4; i++)
{
MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("Process 0 received number %d from i:%d\n", number, i);
}
现在运行代码将为您提供:
Process 1 received number -2 from process 0
Process 2 received number -2 from process 0
Process 3 received number -2 from process 0
Process 0 received number 2 from i:1
Process 0 received number 3 from i:2
Process 0 received number 4 from i:3
Process 0 received number -2 from process 0
请注意,使用Gilles提到的MPI_Bcast和MPI_Gather是一种更加高效和标准的数据分发/收集方式。