Question

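I have a matrix distributed over 4 nodes. I want each node to send its part of the matrix and to receive every other part from the other nodes, one at a time. The matrix blocks have different dimensions.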

I tried writing some code, but it does not work as expected.

  /* send my part of the matrix */
  for (int i = 0; i < numtasks; i++){
    if (i == taskid) continue;

    MPI_Isend(matrix_block, size, MPI_INT, i, 0,
              MPI_COMM_WORLD, &rNull);
  }

  /* receive everyone's part of the matrix */
  for (int i = 0; i < numtasks; i++){
    if (i == taskid) continue;

    MPI_Irecv(brec, lenghts_recv[i], MPI_INT, i, 0,
              MPI_COMM_WORLD, &request[i]);
  }

  for (int i = 0; i < numtasks - 1; i++){
    int index;
    MPI_Waitany(numtasks-1, request, &index, &status);
  }

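I thought each node would first send the block it owns and then receive what the other nodes sent to it, but evidently that is wrong.

Also, a solution like MPI_Alltoall will not work in my case, because the matrix is supposed to be huge and would not fit on a single node.

Can you suggest a way to perform an all-to-all operation while using only one part of the matrix at a time?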

Answer 1

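You can use MPI_Bcast to have each of the four nodes send its part of the matrix to the other three nodes. That way, you split the all-to-all operation into several one-to-all operations, which you can interleave with computation.

Basically, you can do something like this: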

 for (int i = 0; i < numtasks; i++){
     /* Process i sends the data in matrix_block to all other processes. This is a
        collective operation, i.e., after the operation, every process will have
        received the data into matrix_block. */
     MPI_Bcast(matrix_block, size, MPI_INT, i, MPI_COMM_WORLD);

     /* TODO: do all necessary computation on this part of the matrix */
 }

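I am not sure how your code works or what all the variables are, so I cannot give you anything more specific. If you update your question with a minimal working example, I may be able to help further.

You can find an example of how to use MPI_Bcast in this excellent answer.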

MPI all-to-all operation, one part of a huge matrix at a time
