Is it wrong to call MPI_Bcast multiple times?

Asked: 2019-05-08 20:37:28

Tags: c++ mpi

I am trying to implement matrix-vector multiplication with MPI (i.e. multiplying an n x n matrix by an n x 1 vector).

Originally I decided to use multiple MPI_Bcast calls (before I noticed MPI_Allgather ...), and I stumbled upon some strange behaviour: apparently, data is received regardless of which rank is passed to the MPI_Bcast call.

Part of the code used (the functions are called one right after the other, so the sending broadcast happens before the receiving broadcasts). The prints are only for debugging purposes, and I know the test data has length 2:

class Processor
{
public:
    Processor(int rank, int communicatorSize);

private:
    void broadcastOwnVectorToOtherRanks();
    void receiveBroadcastsFromOtherRanks();
    //...

    int ownRank;
    int communicatorSize;
    std::vector<int> ownVectorPart;
    std::vector<int> totalVector;
    //...
};

void Processor::broadcastOwnVectorToOtherRanks()
{
    //ownVectorPart is correctly filled before this function call
    std::printf("Own data in vector %d %d\n", ownVectorPart[0], ownVectorPart[1]);
    MPI_Bcast(ownVectorPart.data(), ownVectorPart.size(), MPI_INT, ownRank, MPI_COMM_WORLD);
}

void Processor::receiveBroadcastsFromOtherRanks()
{
    for (int rank = 0; rank < communicatorSize; ++rank)
    {
        if (rank == ownRank)
        {
            totalVector.insert(totalVector.end(), ownVectorPart.begin(), ownVectorPart.end());
        }
        else
        {
            std::vector<int> buffer(ownVectorPart.size());
            MPI_Bcast(buffer.data(), ownVectorPart.size(), MPI_INT, rank, MPI_COMM_WORLD);
            std::printf("Received from process with rank %d: %d %d\n", rank, buffer[0], buffer[1]);
            totalVector.insert(totalVector.end(), buffer.begin(), buffer.end());
        }
    }
}

Results (ordered by rank):

[0] Own data in vector 0 1
[0] Received from communicator 1: 6 7
[0] Received from communicator 2: 4 5
[0] Received from communicator 3: 2 3
[1] Own data in vector 2 3
[1] Received from communicator 0: 0 1
[1] Received from communicator 2: 4 5
[1] Received from communicator 3: 6 7
[2] Own data in vector 4 5
[2] Received from communicator 0: 0 1
[2] Received from communicator 1: 2 3
[2] Received from communicator 3: 6 7
[3] Own data in vector 6 7
[3] Received from communicator 0: 4 5
[3] Received from communicator 1: 2 3
[3] Received from communicator 2: 0 1

As you can see, in the processes with rank 0 and 3 the received data differs from the data that was sent. For example, the process with rank 0 received the data of the process with rank 3, even though it expected data from process 1.

It looks to me as if the rank is ignored when receiving broadcast data, and MPI assigns the incoming data in whatever order it arrives, regardless of whether it came from the expected rank.

Why is data received from the process with rank 3 when the rank passed to MPI_Bcast as an argument is 1? Is calling MPI_Bcast multiple times undefined behaviour, or is there an error in my code?
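In case it helps to reproduce this, here is a self-contained reduction of the same call pattern (my own minimal example, not the class above): every rank first broadcasts as root with its own rank, then loops over the remaining ranks as a receiver, so the order of MPI_Bcast calls differs from rank to rank.

```cpp
#include <mpi.h>
#include <cstdio>
#include <vector>

// Standalone reduction of the pattern used above (not the original class):
// each rank broadcasts its own two values first, then loops over the other
// ranks to receive theirs.
int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    std::vector<int> own = { 2 * rank, 2 * rank + 1 };
    MPI_Bcast(own.data(), own.size(), MPI_INT, rank, MPI_COMM_WORLD);

    for (int other = 0; other < size; ++other)
    {
        if (other == rank)
            continue;
        std::vector<int> buffer(own.size());
        MPI_Bcast(buffer.data(), buffer.size(), MPI_INT, other, MPI_COMM_WORLD);
        std::printf("[%d] Received from communicator %d: %d %d\n",
                    rank, other, buffer[0], buffer[1]);
    }

    MPI_Finalize();
    return 0;
}
```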

1 Answer:

Answer 0 (score: 0)

Quoting the MPI 3.1 standard (Section 5.12):

All processes must call collective operations (blocking and non-blocking) in the same order per communicator. In particular, once a process calls a collective operation, all other processes in the communicator must eventually call the same collective operation, and no other collective operation with the same communicator in between.

Combining this with Section 5.4:

If comm is an intracommunicator, MPI_BCAST broadcasts a message from the process with rank root to all processes of the group, itself included. It is called by all members of the group using the same arguments for comm and root.

I read these two sections together as meaning that you must call MPI_Bcast, and similar collective communication functions, in the same order and with the same arguments on all processes; calling it with different root values on different processes is invalid.
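As a minimal sketch of a conforming version (reusing the member names from your class; the combined function itself is hypothetical), every rank would walk the roots in the same order and call MPI_Bcast with identical arguments:

```cpp
// Hypothetical replacement for the two functions above: all ranks issue the
// broadcasts in the same order (root = 0, 1, 2, ...), as the standard requires.
void Processor::broadcastAndReceiveAllParts()
{
    for (int root = 0; root < communicatorSize; ++root)
    {
        if (root == ownRank)
        {
            // This rank is the root of the current broadcast: it sends its own part.
            MPI_Bcast(ownVectorPart.data(), ownVectorPart.size(), MPI_INT,
                      root, MPI_COMM_WORLD);
            totalVector.insert(totalVector.end(),
                               ownVectorPart.begin(), ownVectorPart.end());
        }
        else
        {
            // Every other rank takes part in the same broadcast as a receiver.
            std::vector<int> buffer(ownVectorPart.size());
            MPI_Bcast(buffer.data(), buffer.size(), MPI_INT,
                      root, MPI_COMM_WORLD);
            totalVector.insert(totalVector.end(), buffer.begin(), buffer.end());
        }
    }
}
```

This way the i-th MPI_Bcast issued by every process has the same root, so the calls match up across ranks.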

I believe MPI_Allgather is better suited to the communication you seem to want: it gathers an equal amount of data from every process and copies it to every process.
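As a rough sketch of that approach (again reusing your member names; the function is hypothetical and assumes every rank contributes a part of the same length), the whole exchange collapses into a single call:

```cpp
// Hypothetical MPI_Allgather variant: each rank contributes ownVectorPart and
// receives the concatenation of all parts, ordered by rank, in totalVector.
void Processor::gatherAllParts()
{
    totalVector.resize(ownVectorPart.size() * communicatorSize);
    MPI_Allgather(ownVectorPart.data(), ownVectorPart.size(), MPI_INT,
                  totalVector.data(), ownVectorPart.size(), MPI_INT,
                  MPI_COMM_WORLD);
}
```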