I am trying to implement a master/slave pattern, where the master holds an array (acting as a job queue) and sends data to the slave processes. Based on the data it receives from the master, each slave computes a result and returns the answer to the master. The master receives the result, figures out which slave rank the message came from, and sends the next job to that slave.
Here is the skeleton of the code I have implemented:
if (my_rank != 0)
{
    /* slave: receive one job, process it, send the result back */
    MPI_Recv(&seed, 1, MPI_FLOAT, 0, tag, MPI_COMM_WORLD, &status);
    // .. some processing
    MPI_Send(&message, 100, MPI_FLOAT, 0, my_rank, MPI_COMM_WORLD);
}
else
{
    /* master: hand one job to each of the p-1 slaves */
    for (i = 1; i < p; i++) {
        MPI_Send(&A[i], 1, MPI_FLOAT, i, tag, MPI_COMM_WORLD);
    }
    /* then keep feeding jobs A[p..S] to whichever slave answers */
    for (i = p; i <= S; i++) {
        MPI_Recv(&buf, 100, MPI_FLOAT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        // .. processing to find out the free slave rank (y) the above msg was received from
        MPI_Send(&A[i], 1, MPI_FLOAT, y, tag, MPI_COMM_WORLD);
    }
    /* collect the final result from each slave */
    for (i = 1; i < p; i++) {
        MPI_Recv(&buf, 100, MPI_FLOAT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        // .. more processing
    }
}
If I run with 4 processors (1 master, 3 slaves), the program sends and receives the messages for the first 3 jobs in the queue, but after that it hangs. What could be the problem?
Answer 0 (score: 0)
If this is the entirety of your MPI-based code, then it looks like you are missing a while loop around the slave-side code (a minimal fix of your original skeleton is sketched at the end of this answer). I have done this before, and I usually break it into a taskMaster and peons:
for (int i = 0; i < commSize; ++i){
    if (i == commRank){ // commRank doesn't have to be 0
        continue;
    }
    if (taskNum < taskCount){
        // tasks is a vector<Task>, where I have created a Task
        // class and send it as a stream of bytes
        toSend = tasks.at(taskNum);
        jobList.at(i) = taskNum; // so we know which rank has which task
        taskNum += 1;
        activePeons += 1;
    } else {
        // stopTask is a flag value that tells the receiving peon to stop
        toSend = stopTask;
        allTasksDistributed = true;
    }
    // send the task, with the size of the task as the tag
    taskSize = sizeof(toSend);
    MPI_Send(&toSend, taskSize, MPI_CHAR, i, taskSize, MPI_COMM_WORLD);
}
MPI_Status status;
while (activePeons > 0){
    // get the result from a peon (figure out who it is coming from and what the size is)
    MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
    MPI_Recv(&toSend,            // receive the incoming task (with result data)
             status.MPI_TAG,     // tag holds the number of bytes
             MPI_CHAR,           // type, may need to be more robust later
             status.MPI_SOURCE,  // source of the send
             MPI_ANY_TAG,        // tag
             MPI_COMM_WORLD,     // comm
             &status);           // status
    // put the result from that task into the results vector
    results[jobList[status.MPI_SOURCE]] = toSend.getResult();
    // if there are more tasks to send, distribute the next one
    if (taskNum < taskCount){
        toSend = tasks.at(taskNum);
        jobList[status.MPI_SOURCE] = taskNum;
        taskNum += 1;
    } else { // otherwise send the stop task and decrement activePeons
        toSend = stopTask;
        activePeons -= 1;
    }
    // send the task, with the size of the task as the tag
    taskSize = sizeof(toSend);
    MPI_Send(&toSend, taskSize, MPI_CHAR, status.MPI_SOURCE, taskSize, MPI_COMM_WORLD);
}
In the peon function, the same tag-as-size protocol is mirrored: probe first, read the size from status.MPI_TAG, then allocate and receive exactly that much:
while (running){
    MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status); // tag holds data size
    incoming = (Task *) malloc(status.MPI_TAG);
    MPI_Recv(incoming,           // memory location of input
             status.MPI_TAG,     // tag holds data size
             MPI_CHAR,           // type of data
             status.MPI_SOURCE,  // source is the distributor
             MPI_ANY_TAG,        // tag
             MPI_COMM_WORLD,     // comm
             &status);           // status
    task = Task(*incoming);
    if (task.getFlags() == STOP_FLAG){
        running = false;
        continue;
    }
    task.run(); // my Task class has a "run" method
    MPI_Send(&task,              // task to send back
             status.MPI_TAG,     // size in = size out
             MPI_CHAR,           // data type
             status.MPI_SOURCE,  // destination
             status.MPI_TAG,     // tag carries the size back for the master's probe
             MPI_COMM_WORLD);    // comm
    free(incoming);
}
There are a few bool and int values that have to be declared (and, as I said, I have a Task class), but this gives the basic structure of what I think you want to do. The stop task is what prevents exactly the hang you are describing: without a loop and a termination signal, each slave exits after its single job while the master blocks in MPI_Recv waiting for results that will never arrive.
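For reference, here is the same fix mapped back onto the float-based skeleton from the question, as a minimal self-contained sketch rather than a drop-in replacement. The JOB_TAG/STOP_TAG values, the dummy stop payload, and the stand-in processing are assumptions introduced here. Note also that with MPI_ANY_SOURCE the sender's rank is already in status.MPI_SOURCE, so the "find out which slave sent it" step (your y) reduces to reading that field:

#include <mpi.h>

#define S        10  /* total number of jobs, as in the question   */
#define JOB_TAG   1  /* hypothetical tag marking a normal job      */
#define STOP_TAG  2  /* hypothetical tag telling a slave to finish */

int main(int argc, char **argv)
{
    int my_rank, p, i;
    float seed, dummy = 0.0f;
    float A[S + 1];      /* job queue, indexed 1..S as in the question */
    float message[100];  /* slave's result buffer */
    float buf[100];      /* master's receive buffer */
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    if (my_rank != 0) {
        /* slave: keep taking jobs until the master says stop */
        while (1) {
            MPI_Recv(&seed, 1, MPI_FLOAT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
            if (status.MPI_TAG == STOP_TAG)
                break;
            message[0] = seed * 2.0f;  /* stand-in for the real processing */
            MPI_Send(message, 100, MPI_FLOAT, 0, my_rank, MPI_COMM_WORLD);
        }
    } else {
        for (i = 1; i <= S; i++)
            A[i] = (float) i;

        /* one job per slave to start things off */
        for (i = 1; i < p; i++)
            MPI_Send(&A[i], 1, MPI_FLOAT, i, JOB_TAG, MPI_COMM_WORLD);

        /* refill whichever slave answers until the queue is drained */
        for (i = p; i <= S; i++) {
            MPI_Recv(buf, 100, MPI_FLOAT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            MPI_Send(&A[i], 1, MPI_FLOAT, status.MPI_SOURCE, JOB_TAG,
                     MPI_COMM_WORLD);
        }

        /* collect the last result from each slave, then tell it to stop */
        for (i = 1; i < p; i++) {
            MPI_Recv(buf, 100, MPI_FLOAT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            MPI_Send(&dummy, 1, MPI_FLOAT, status.MPI_SOURCE, STOP_TAG,
                     MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
    return 0;
}

Compiled and launched in the usual way (e.g. mpicc followed by mpirun -np 4), each of the three slaves now keeps looping until it sees STOP_TAG, so the master's final collection loop can drain the last results instead of waiting forever.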