C, MPI thread communication problem

Asked: 2018-01-26 00:21:36

Tags: c multithreading mpi

Given this structure:

typedef struct 
{
    double rx, ry, rz;
    double vx, vy, vz;
    double fx, fy, fz;
    double mass;
} Body;

I am trying to send it between MPI processes. Since it is a custom struct, I created an MPI datatype for it:

int bodyParamas=10;
int blocklengths[10] = {1,1,1,1,1,1,1,1,1,1};
MPI_Datatype types[10] = {MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE};
MPI_Datatype mpi_body_type;
MPI_Aint     offsets[10];
offsets[0] = offsetof(Body, rx);
offsets[1] = offsetof(Body, ry);
offsets[2] = offsetof(Body, rz);
offsets[3] = offsetof(Body, vx);
offsets[4] = offsetof(Body, vy);
offsets[5] = offsetof(Body, vz);
offsets[6] = offsetof(Body, fx);
offsets[7] = offsetof(Body, fy);
offsets[8] = offsetof(Body, fz);
offsets[9] = offsetof(Body, mass);
MPI_Type_create_struct(bodyParamas, blocklengths, offsets, types, &mpi_body_type);
MPI_Type_commit(&mpi_body_type);
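
Not related to the error below, but a common refinement when building a struct datatype like this: if the compiler ever adds trailing padding to Body, the type's extent will not match sizeof(Body), and arrays of Body would be strided incorrectly on send/receive. A minimal sketch using MPI_Type_create_resized that guards against that (optional here, since ten doubles normally have no padding):

/* Force the datatype's extent to sizeof(Body) so that consecutive
 * array elements line up even if the struct has trailing padding. */
MPI_Datatype mpi_body_type_resized;
MPI_Type_create_resized(mpi_body_type, 0, sizeof(Body), &mpi_body_type_resized);
MPI_Type_commit(&mpi_body_type_resized);
/* then use mpi_body_type_resized in MPI_Send/MPI_Recv instead of mpi_body_type */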

Then, in my for loop, I send the data and receive it on the other processes (every rank other than the root):

        if(my_id == root_process) {
            int starting_bodies_array_index = -1;
            for(an_id = 1; an_id < num_procs; an_id++) {
                start_body_index = an_id*num_of_bodies_per_process + 1;
                end_body_index = (an_id + 1)*num_of_bodies_per_process;

                num_of_bodies_to_send = end_body_index - start_body_index + 1;
                starting_bodies_array_index += num_of_bodies_to_send;

                ierr = MPI_Send( &starting_bodies_array_index, 1 , MPI_INT,
                      an_id, send_data_tag, MPI_COMM_WORLD);

                ierr = MPI_Send( &bodies[starting_bodies_array_index], num_of_bodies_to_send, mpi_body_type,
                      an_id, send_data_tag, MPI_COMM_WORLD);
            }
        }
        else {

            ierr = MPI_Recv(&num_of_bodies_to_recive, 1, MPI_INT, 
                   root_process, send_data_tag, MPI_COMM_WORLD, &status);

            ierr = MPI_Recv(&bodiesRecived, num_of_bodies_to_recive, mpi_body_type, 
                   root_process, send_data_tag, MPI_COMM_WORLD, &status);

            num_of_bodies_recived = num_of_bodies_to_recive;

        }

I don't know what is wrong with my code. I am fairly sure my custom MPI type is correct, and the error does not mention it. Here is the error I see:

*** An error occurred in MPI_Recv
*** reported by process [1580531713,1]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_TRUNCATE: message truncated
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)

Does anyone see the mistake?

1 Answer:

Answer 0 (score: 2):

The root cause is that the root MPI_Send()s num_of_bodies_to_send elements, but the receiver MPI_Recv()s only starting_bodies_array_index elements: your first MPI_Send() transmits starting_bodies_array_index instead of the element count, the receiver stores that (smaller) value in num_of_bodies_to_recive and uses it as the buffer size for the second MPI_Recv(), so the incoming message no longer fits and gets truncated.

You should replace the first MPI_Send() with:

ierr = MPI_Send( &num_of_bodies_to_send, 1 , MPI_INT,
                 an_id, send_data_tag, MPI_COMM_WORLD);
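
For completeness, a minimal sketch of what the corrected exchange could look like, assuming each worker rank receives a contiguous block of num_of_bodies_per_process bodies and that bodiesRecived is large enough to hold them (variable names follow the question; the indexing is illustrative, not the asker's exact layout):

if (my_id == root_process) {
    for (an_id = 1; an_id < num_procs; an_id++) {
        /* contiguous chunk assigned to this worker (illustrative indexing) */
        int first = an_id * num_of_bodies_per_process;
        int count = num_of_bodies_per_process;

        /* send the element count first ... */
        ierr = MPI_Send(&count, 1, MPI_INT, an_id, send_data_tag, MPI_COMM_WORLD);
        /* ... then exactly that many Body elements */
        ierr = MPI_Send(&bodies[first], count, mpi_body_type,
                        an_id, send_data_tag, MPI_COMM_WORLD);
    }
} else {
    ierr = MPI_Recv(&num_of_bodies_to_recive, 1, MPI_INT,
                    root_process, send_data_tag, MPI_COMM_WORLD, &status);
    /* the receive count now matches what the root actually sent */
    ierr = MPI_Recv(bodiesRecived, num_of_bodies_to_recive, mpi_body_type,
                    root_process, send_data_tag, MPI_COMM_WORLD, &status);
    num_of_bodies_recived = num_of_bodies_to_recive;
}

With both sides agreeing on the element count, the second MPI_Recv buffer is always large enough for the incoming message and the MPI_ERR_TRUNCATE error goes away.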