Question

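I want to parallelize my code so that each processor's submatrix is gathered into a single matrix on the master processor.

For example, this is what I am trying to do: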

MPI_Datatype rowtype_temp, rowtype, coltype_temp, coltype, mtype_temp, mtype;
double **f, **f_local;
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
local_n = n / comm_sz;

MPI_Type_vector(n, 1, local_n, MPI_DOUBLE, &rowtype_temp);
MPI_Type_commit(&rowtype_temp);
MPI_Type_create_resized(rowtype_temp, 0, sizeof(double), &rowtype);
MPI_Type_commit(&rowtype);

MPI_Type_vector(n, 1, n, MPI_DOUBLE, &coltype_temp);
MPI_Type_commit(&coltype_temp);
MPI_Type_create_resized(coltype_temp, 0, sizeof(double), &coltype);
MPI_Type_commit(&coltype);

MPI_Type_vector(local_n, 1, comm_sz, coltype, &mtype_temp);
MPI_Type_commit(&mtype_temp);
MPI_Type_create_resized(mtype_temp, 0, sizeof(double), &mtype);
MPI_Type_commit(&mtype);

f_local = (double**) malloc (n * sizeof(double *));
for (i = 0; i < n; i++)  f_local[i] = (double *) malloc (local_n * sizeof(double));
f = (double**) malloc (n  * sizeof(double *));
for (i = 0; i < n ; i++) f[i] = (double *) malloc (n * sizeof(double));


MPI_Gather(&f_local[0][0], local_n, rowtype, &f[0][0], 1 , mtype, 0, MPI_COMM_WORLD);
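One thing worth checking in the snippet above: `f` and `f_local` are built from one `malloc` per row, so the rows are not guaranteed to sit next to each other in memory. The derived datatypes describe offsets from `&f[0][0]`, which only works if the whole matrix is a single contiguous block. The following is a minimal sketch of one common way to get that, not necessarily how the final code has to look. The helper names `alloc_matrix` and `free_matrix` are hypothetical and do not come from the original post.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a rows x cols matrix as ONE contiguous block of doubles,
 * plus an array of row pointers so that m[i][j] indexing still works.
 * Returns NULL if either allocation fails.
 * (Hypothetical helper, not part of the original code.) */
double **alloc_matrix(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);

    double *data = calloc(rows * cols, sizeof *data); /* zero-initialised storage */
    double **m = malloc(rows * sizeof *m);            /* row-pointer table */
    if (data == NULL || m == NULL) {
        free(data);
        free(m);
        return NULL;
    }

    for (size_t i = 0; i < rows; i++)
        m[i] = data + i * cols; /* row i starts i*cols doubles into the block */

    return m;
}

/* Release a matrix created by alloc_matrix. */
void free_matrix(double **m)
{
    if (m != NULL) {
        free(m[0]); /* the contiguous data block */
        free(m);    /* the row-pointer table */
    }
}
```

With this layout, `&m[0][0]` points at `rows * cols` consecutive doubles, so strides expressed in units of `MPI_DOUBLE` refer to real elements of the matrix rather than to whatever happens to follow the first row in memory.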

This is part of my code. Its output looks like this:

processor 0 : a b a b
              0 0 0 0 
              0 0 0 0
              0 0 0 0
              0 0 0 0
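To make the target layout concrete, here is a small serial stand-in for the gather that uses no MPI calls at all. It is only a sketch under an assumption: that the intended distribution is column-cyclic, meaning local column `j` of rank `r` should end up in global column `j * comm_sz + r`. That reading is inferred from the stride of `comm_sz` in `mtype` and from the `a b a b` row in the output, and the post does not state it explicitly. The function name `place_block` is hypothetical, and both matrices are assumed to be stored contiguously in row-major order.

```c
#include <assert.h>
#include <stddef.h>

/* Copy one rank's n x local_n block into the n x n global matrix.
 * Both arrays are contiguous and row-major.
 * ASSUMED layout (column-cyclic): local column j of rank `rank`
 * becomes global column j * comm_sz + rank.
 * (Hypothetical helper, not part of the original code.) */
void place_block(double *global, const double *local,
                 int n, int comm_sz, int rank)
{
    assert(comm_sz > 0 && n % comm_sz == 0);
    assert(rank >= 0 && rank < comm_sz);

    int local_n = n / comm_sz; /* columns owned by each rank */

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < local_n; j++) {
            size_t dst = (size_t)i * n + (size_t)j * comm_sz + rank;
            size_t src = (size_t)i * local_n + j;
            global[dst] = local[src];
        }
    }
}
```

Calling `place_block` once per rank fills every row of the global matrix, not only the first one. Whatever receive datatype is passed to `MPI_Gather` on the root would need to describe these same offsets for each rank's contribution, so comparing the two results on a small case such as `n = 4` with two ranks is one way to see where the derived types go wrong.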


MPI_Gather of submatrices into a matrix on the master processor

0 answers: