I am solving the Laplace equation on a partitioned unstructured mesh using MPI. The plan is to first finish sending and receiving data from the neighbouring partitions, and then do the computation on each processor. MPI_Waitall is used to wait for all of the MPI_Isend() and MPI_Irecv() calls to complete, but the problem is that every processor passes through MPI_Waitall and then gets stuck reading the received buffer data, because none of the processors actually receives any data (the flags returned by MPI_Testall are 0). As I understand it, MPI_Irecv should have received the data before MPI_Waitall returns.
double **sbuf = calloc(partition->ptn_nnbr[my_id], sizeof(double *));
double **rbuf = calloc(partition->ptn_nnbr[my_id], sizeof(double *));
for (i = 0; i < partition->ptn_nnbr[my_id]; i++)
{
    //rbuf[i] = calloc(partition->ptn_cnt[my_id][k1], sizeof(double));
    rbuf[i] = calloc(MAX_nnode, sizeof(double));
    sbuf[i] = calloc(MAX_nnode, sizeof(double));
}

nrm = 1;   // nrm = max(abs(r[i])), i = 1..n
iter = 0;
printf("Entering jacobi; iterations = %d, error norm = %e\n", iter, nrm);

while (nrm > TOL && iter < 4)
{
    init_boundary_conditions_ptn(x_ptn, mesh, my_id, partition);
    iter++;

    int req_idx = 0;
    int idx = 0;
    MPI_Request *request = (MPI_Request *) calloc(2 * partition->ptn_nnbr[my_id], sizeof(MPI_Request));
    MPI_Status *status = calloc(2 * partition->ptn_nnbr[my_id], sizeof(MPI_Status));
    int *flag = calloc(2 * partition->ptn_nnbr[my_id], sizeof(int));

    for (k1 = 0; k1 < partition->nptn; k1++)
    {
        if (partition->ptn_list[my_id][k1] != NULL)
        {
            for (i = 0; i < partition->ptn_cnt[k1][my_id]; i++)
            {
                sbuf[idx][i] = x_ptn->val[partition->ptn_list[k1][my_id][i] - partition->ptn[my_id] + 1];
            }
            MPI_Isend(sbuf[idx], partition->ptn_cnt[k1][my_id], MPI_DOUBLE, k1, TAG, MPI_COMM_WORLD, &request[req_idx]);
            //printf("isend done from nbr %d for partition %d \n", k1, my_id);
            req_idx++;
            idx++;
        }
    }

    idx = 0;
    for (k1 = 0; k1 < partition->nptn; k1++)
    {
        if (partition->ptn_list[my_id][k1] != NULL)
        {
            MPI_Irecv(rbuf[idx], partition->ptn_cnt[my_id][k1], MPI_DOUBLE, k1, TAG, MPI_COMM_WORLD, &request[req_idx]);
            //printf("irecv done from nbr %d for partition %d \n", k1, my_id);
            req_idx++;
            idx++;
        }
    }

    printf("partition %d is waiting \n", my_id);

    MPI_Testall(2 * partition->ptn_nnbr[my_id], request, flag, status);
    for (i = 0; i < 2 * partition->ptn_nnbr[my_id]; i++)
    {
        printf("flag[%d] is %d from partition %d\n", i, flag[i], my_id);
    }

    MPI_Waitall(2 * partition->ptn_nnbr[my_id], request, status);
    printf("partition %d pass MPI_Wait \n", my_id);

    for (k1 = 0; k1 < partition->nptn; k1++)
    {
        if (partition->ptn_list[my_id][k1] != NULL)
        {
            MPI_Probe(k1, TAG, MPI_COMM_WORLD, status1);
            MPI_Get_count(status1, MPI_DOUBLE, &count);
            printf("count is %d from nbr %d \n", count, k1);
            for (i = 0; i < count; i++)
            {
                x->val[partition->ptn_list[my_id][k1][i]] = rbuf[idx][i];
            }
        }
    }
    //printf("exchange complete from partition %d\n", my_id);

    jacobi_step_csr_matrix(A_ptn, x, b_ptn, y_ptn);   // y = inv(D)*(b + (D-A)*x), D = diag(A)
    copy_vector(y_ptn, x_ptn);

    MPI_Gatherv(x_ptn->val, x_ptn->n, MPI_DOUBLE, x->val, x_count, x_dis, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    if (my_id == 0)
    {
        init_boundary_conditions(x, mesh, partition->perm);
        matvec_csr_matrix(A, x, r);   // r = A*x
        sxapy(b, -1.0, r);            // r = b - r
        zero_boundary_conditions(r, mesh, partition->perm);
        nrm = norm_inf(r);
    }
    MPI_Bcast(&nrm, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    printf("nrm is %f from partition %d in iter %d \n", nrm, my_id, iter);

    free(request);
    free(status);
}
The output is:
Processor 0 start Jacobi
MAx_node is 2 from partition 0
Entering jacobi; iterations = 0, error norm = 1.000000e+00
Processor 2 start Jacobi
MAx_node is 2 from partition 2
Entering jacobi; iterations = 0, error norm = 1.000000e+00
Processor 3 start Jacobi
MAx_node is 2 from partition 3
Entering jacobi; iterations = 0, error norm = 1.000000e+00
Processor 1 start Jacobi
MAx_node is 2 from partition 1
Entering jacobi; iterations = 0, error norm = 1.000000e+00
partition 3 is waiting
flag[0] is 0 from partition 3
flag[1] is 0 from partition 3
flag[2] is 0 from partition 3
flag[3] is 0 from partition 3
partition 3 pass MPI_Wait
partition 0 is waiting
flag[0] is 0 from partition 0
flag[1] is 0 from partition 0
flag[2] is 0 from partition 0
flag[3] is 0 from partition 0
partition 0 pass MPI_Wait
partition 2 is waiting
flag[0] is 0 from partition 2
flag[1] is 0 from partition 2
flag[2] is 0 from partition 2
flag[3] is 0 from partition 2
partition 2 pass MPI_Wait
partition 1 is waiting
flag[0] is 0 from partition 1
flag[1] is 0 from partition 1
flag[2] is 0 from partition 1
flag[3] is 0 from partition 1
partition 1 pass MPI_Wait
Answer (score: 1)
It seems to me that your understanding of non-blocking communication in MPI is somewhat fuzzy. First of all, you are using the wrong test call. MPI_Testall outputs a single scalar completion flag that indicates whether all of the requests had completed at the moment MPI_Testall was called. If you used MPI_Testsome instead, you would notice that only some of the requests (or, more likely, none of them) have completed so far. The MPI standard allows the progression of non-blocking operations to be postponed and to only happen under certain circumstances. Completion is only guaranteed:

- after MPI_Wait{all|some|any}, which does not return at all before the request(s) complete;
- after MPI_Test{all|some|any} returns a true completion flag.

There is no guarantee that a single call to MPI_Test... will result in completion; the test function may have to be called repeatedly until the flag indicates that the request has completed. For performance reasons, most MPI libraries are single-threaded, i.e. there is no background thread progressing the non-blocking calls, unless hardware progression is implemented on some specific architectures. The MPI library therefore needs to be called into periodically in order for non-blocking communication to progress, and your expectation that all non-blocking requests will already have completed by the time you call MPI_Testall is simply wrong.
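For illustration, here is a minimal sketch of what a test-based wait would have to look like if you insisted on using MPI_Testall (nreq stands for the actual number of active requests, i.e. your req_idx; request and status are your arrays):

    int completed = 0;   /* scalar completion flag, not an array */
    while (!completed)
    {
        /* each call gives the library a chance to progress the communication;
           completed becomes 1 only once every request in the array has finished */
        MPI_Testall(nreq, request, &completed, status);
        /* useful local work could be overlapped here */
    }

In practice, a single MPI_Waitall is the cleaner way to reach the same point.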
Besides that, your program is now stuck in MPI_Probe. That is a blocking call, and it belongs before a message is received, not after. The message has already been received by MPI_Irecv, so the probe call is waiting for another message that never arrives. Do not call MPI_Probe; pass the relevant element of the status array to MPI_Get_count instead.
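A sketch of what the unpacking loop could look like with the probe removed, reusing the status entries that MPI_Waitall filled in (the offset of partition->ptn_nnbr[my_id] assumes the receive requests were posted after all of the send requests, as in your code):

    int count;
    idx = 0;   /* restart the receive-buffer index for unpacking */
    for (k1 = 0; k1 < partition->nptn; k1++)
    {
        if (partition->ptn_list[my_id][k1] != NULL)
        {
            /* status of the matching MPI_Irecv, already completed by MPI_Waitall */
            MPI_Get_count(&status[partition->ptn_nnbr[my_id] + idx], MPI_DOUBLE, &count);
            for (i = 0; i < count; i++)
            {
                x->val[partition->ptn_list[my_id][k1][i]] = rbuf[idx][i];
            }
            idx++;
        }
    }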
As a final note, you are passing 2 * partition->ptn_nnbr[my_id] as the number of requests. Make sure that this value actually matches the value accumulated in req_idx, otherwise your program will crash: inactive requests must be set to MPI_REQUEST_NULL, and neither Open MPI nor MPICH uses NULL (which is what calloc(3) sets in your case) for that purpose. You should pass req_idx as the number of requests instead.
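If you would rather keep passing the full array length, a minimal sketch of the alternative is to pre-initialise the request array right after the calloc so that any unused slots are legal to wait on (this is not something your current code does):

    /* make every slot a valid null request; MPI_Waitall / MPI_Testall then
       accept the full length even if fewer requests were actually started */
    for (i = 0; i < 2 * partition->ptn_nnbr[my_id]; i++)
    {
        request[i] = MPI_REQUEST_NULL;
    }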