Deadlock in a master-slave model using MPI

Asked: 2016-01-09 11:45:06

Tags: c++ synchronization mpi master-slave

I am trying to implement a master/slave model with MPI, but I have run into a problem.

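What I want is for the slaves to wait for the master's order: they should not do any work until the master issues it. The master should send the order to all slaves at once, wait for all of them to finish, and then send the next order to all slaves again.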

For example, with 3 processes (1 master, 2 slaves) and two rounds of orders, I want the program to print:

Master initialization done.
Master sends order to slave 1
Master sends order to slave 2
Slave 1 got the order from master
Slave 2 got the order from master
Master got response from Slave 1
Master got response from Slave 2
_________________________________
Master sends order to slave 1
Master sends order to slave 2
Slave 1 got the order from master
Slave 2 got the order from master
Master got response from Slave 1
Master got response from Slave 2
All done.

This is what I have so far.

int count = 0;
int number;
if (procnum == 0) {
    // initialize master, slaves shouldn't be working until this ends
    std::cout << "Master initialization done." << endl;
    while (count < 2) {
        for (int i = 1; i < numprocesses; i++) {
            number = i * 2;
            std::cout << "Master sends order to slave " << i << endl;
            MPI_Send(&number, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
            MPI_Recv(&number, 1, MPI_INT, i, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            std::cout << "Master got response from Slave " << i << endl;
        }
        count++;
    }
    std::cout << "All done" << endl;
} else {
    int received;
    MPI_Recv(&received, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    std::cout << "Slave " << procnum << " got the order from master" << endl;
    MPI_Send(&received, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
}

But I get this instead:

Master initialization done.
Master sends order to slave 1
Slave 1 got the order from master
Master got response from Slave 1
Master sends order to slave 2
Slave 2 got the order from master
Master got response from Slave 2
Master sends order to slave 1

Then it hangs. What am I doing wrong?

1 Answer:

Answer 0: (score: 0)

for (int i = 1; i < size; i++) {

should be

for (int i = 1; i <= size; i++) {

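Edit: never mind. Since size is 3 (including the master), the valid ranks are 0 to 2 and the original bound i < size is correct.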

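About the ordering: MPI_Send and MPI_Recv are blocking calls, and the master waits for each slave's reply before sending the next order, so the output you see is what I would expect (?).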

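If the master blocks in the second round, it is because the slave does not respond: each slave handles a single order and then leaves its branch, so the master's second-round MPI_Recv never gets a reply. The while (count < 2) loop should wrap both the master code and the slave code.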