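I am trying to write my own MPI function that computes the minimum value in a vector and broadcasts it to all processes. I treat the processes as a binary tree and find the minimum while moving from the leaves to the root. I then send the result from the root back down to the leaves through each node's children. However, when running with only 4 processes (ranks 0 to 3), I get a segmentation fault as soon as rank 1 tries to receive the minimum from its left child (rank 3).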
void Communication::ReduceMin(double &partialMin, double &totalMin)
{
MPI_Barrier(MPI_COMM_WORLD);
double *leftChild, *rightChild;
leftChild = (double *)malloc(sizeof(double));
rightChild = (double *)malloc(sizeof(double));
leftChild[0]=rightChild[0]=1e10;
cout<<"COMM REDMIN: "<<myRank<<" "<<partialMin<<" "<<nProcs<<endl;
MPI_Status *status;
//MPI_Recv from 2*i+1 and 2*i+2
if(nProcs > 2*myRank+1)
{
cout<<myRank<<" waiting from "<<2*myRank+1<<" for "<<leftChild[0]<<endl;
MPI_Recv((void *)&leftChild[0], 1, MPI_DOUBLE, 2*myRank+1, 2*myRank+1, MPI_COMM_WORLD, status); //SEG FAULT HERE
cout<<myRank<<" got from "<<2*myRank+1<<endl;
}
if(nProcs > 2*myRank+2)
{
cout<<myRank<<" waiting from "<<2*myRank+2<<endl;
MPI_Recv((void *)rightChild, 1, MPI_DOUBLE, 2*myRank+2, 2*myRank+2, MPI_COMM_WORLD, status);
cout<<myRank<<" got from "<<2*myRank+2<<endl;
}
//sum it up
cout<<myRank<<" finding the min"<<endl;
double myMin = min(min(leftChild[0], rightChild[0]), partialMin);
//MPI_Send to (i+1)/2-1
if(myRank!=0)
{
cout<<myRank<<" sending "<<myMin<<" to "<<(myRank+1)/2 -1 <<endl;
MPI_Send((void *)&myMin, 1, MPI_DOUBLE, (myRank+1)/2 - 1, myRank, MPI_COMM_WORLD);
}
double min;
//MPI_Recv from (i+1)/2-1
if(myRank!=0)
{
cout<<myRank<<" waiting from "<<(myRank+1)/2-1<<endl;
MPI_Recv((void *)&min, 1, MPI_DOUBLE, (myRank+1)/2 - 1, (myRank+1)/2 - 1, MPI_COMM_WORLD, status);
cout<<myRank<<" got from "<<(myRank+1)/2-1<<endl;
}
totalMin = min;
//MPI_send to 2*i+1 and 2*i+2
if(nProcs > 2*myRank+1)
{
cout<<myRank<<" sending to "<<2*myRank+1<<endl;
MPI_Send((void *)&min, 1, MPI_DOUBLE, 2*myRank+1, myRank, MPI_COMM_WORLD);
}
if(nProcs > 2*myRank+2)
{
cout<<myRank<<" sending to "<<2*myRank+2<<endl;
MPI_Send((void *)&min, 1, MPI_DOUBLE, 2*myRank+2, myRank, MPI_COMM_WORLD);
}
}
PS: I know I could just use
MPI_Barrier(MPI_COMM_WORLD);
MPI_Reduce((void *)&partialMin, (void *)&totalMin, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
MPI_Bcast((void *)&totalMin, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
but I want to write my own code for fun.
Answer 0 (score: 0)
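The error is in the way you use the status argument in the receive calls. Instead of passing the address of an MPI_Status instance, you pass an uninitialised pointer, and MPI_Recv crashes when it tries to write the status through it: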
MPI_Status *status; // status declared as a pointer and never initialised
...
MPI_Recv((void *)&leftChild[0], 1, MPI_DOUBLE, 2*myRank+1, 2*myRank+1,
MPI_COMM_WORLD, status); // status is an invalid pointer here
You should change your code to:
MPI_Status status;
...
MPI_Recv((void *)&leftChild[0], 1, MPI_DOUBLE, 2*myRank+1, 2*myRank+1,
MPI_COMM_WORLD, &status);
Since you never examine the status anywhere in your code, you can simply pass MPI_STATUS_IGNORE in all receive calls:
MPI_Recv((void *)&leftChild[0], 1, MPI_DOUBLE, 2*myRank+1, 2*myRank+1,
MPI_COMM_WORLD, MPI_STATUS_IGNORE);
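One more thing worth checking once the crash is gone: in the posted function, rank 0 never assigns `min` before copying it into `totalMin` and sending it to its children, so the root should presumably use its own `myMin` at that point. The tree index arithmetic itself can be checked without MPI. The following is only a minimal sketch under that assumption: it simulates the upward and downward passes over a plain array, and `simulateTreeMin` and the three index helpers are hypothetical names that are not part of the original code.

```cpp
#include <algorithm>
#include <vector>

// Index helpers for an implicit binary tree over ranks 0..n-1,
// using the same layout as the question (children of i are 2i+1 and 2i+2).
inline int leftChild(int rank)  { return 2 * rank + 1; }
inline int rightChild(int rank) { return 2 * rank + 2; }
inline int parent(int rank)     { return (rank + 1) / 2 - 1; }  // only meaningful for rank > 0

// Hypothetical stand-in for the MPI exchange: partial[i] plays the role of
// rank i's partialMin. Returns the value each rank would end up with in totalMin.
std::vector<double> simulateTreeMin(const std::vector<double>& partial) {
    const int n = static_cast<int>(partial.size());
    std::vector<double> myMin(partial);

    // Upward pass: children always have higher ranks than their parent, so
    // walking from the last rank down to 1 means a child's value is final
    // before it is folded into its parent.
    for (int r = n - 1; r > 0; --r) {
        const int p = parent(r);
        myMin[p] = std::min(myMin[p], myMin[r]);
    }

    // Downward pass: the root keeps its own reduced value (the step the posted
    // code skips); every other rank copies what its parent holds.
    std::vector<double> total(n);
    if (n > 0) total[0] = myMin[0];
    for (int r = 1; r < n; ++r) total[r] = total[parent(r)];
    return total;
}
```

Calling `simulateTreeMin({4.0, 7.0, 1.5, 3.0})` models the 4-process run from the question: rank 3 reports to rank 1, ranks 1 and 2 report to rank 0, and every entry of the result equals the smallest input.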