Question

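I am working with some code that implements a manual MPI broadcast, basically a demo that unicasts an integer from the root to every other node. Of course, unicasting to many nodes is less efficient than MPI_Bcast(), but I just want to check how it works.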

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

void my_bcast(void* data, int count, MPI::Datatype datatype, int root, MPI::Intracomm communicator) {
    int world_size = communicator.Get_size();
    int world_rank = communicator.Get_rank();

    if (world_rank == root) {
        // If we are the root process, send our data to everyone
        int i;
        for (i = 0; i < world_size; i++) {
            if (i != world_rank) {
                communicator.Send(data, count, datatype, i, 0);
            }
        }
    } else {
        // If we are a receiver process, receive the data from the root
        communicator.Recv(data, count, datatype, root, 0);
    }
}

int main(int argc, char** argv) {
    MPI::Init();

    int world_rank = MPI::COMM_WORLD.Get_rank();

    int data;
    if (world_rank == 0) {
        data = 100;
        printf("Process 0 broadcasting data %d\n", data);
        my_bcast(&data, 1, MPI::INT, 0, MPI::COMM_WORLD);
    } else {
        my_bcast(&data, 1, MPI::INT, 0, MPI::COMM_WORLD);
        printf("Process %d received data %d from root process\n", world_rank, data);
    }

    MPI::Finalize();
}

I noticed that if I remove the check that keeps the root from sending to itself,

if (i != world_rank) {
...
}

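the program still works and does not block, whereas the default behavior of MPI_Send() is supposed to be blocking, i.e. to wait until the data has been received at the other end. But MPI_Recv() is never called by the root. Can someone explain why this happens?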

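I run the code from the root node with the following command (the cluster is set up on Amazon EC2, uses NFS as shared storage between the nodes, and has Open MPI 1.10.2 installed on all machines):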

mpirun -mca btl ^openib -mca plm_rsh_no_tree_spawn 1 /EC2_NFS/my_bcast

The C file is compiled with

mpic++ my_bcast.c

and the mpic++ version is 5.4.0.

The code is taken from www.mpitutorial.com

Answer 1

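You are confusing blocking with synchronous behavior. Blocking means that the call does not return until the operation has completed. A standard send (MPI_Send) completes as soon as the supplied buffer can be reused by the program. That means either the message has been fully transmitted to the receiver, or it has been stored internally by the MPI library for later delivery (a buffered send). The buffering behavior is implementation-specific, but most libraries will buffer a message the size of a single integer. Use MPI_Ssend (or the C++ equivalent) to force synchronous mode and the program will hang.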

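Note that the C++ MPI bindings are no longer part of the standard (they were deprecated in MPI-2.2 and removed in MPI-3.0) and should not be used for developing new software. Use the C bindings MPI_Blabla instead.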

When MPI_Send does not block