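I'm working on a parallel sort program to learn MPI, and I keep running into problems with MPI_Scatter. Every time I try to run it, I get the following: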
reading input
Scattering input
_pmii_daemon(SIGCHLD): [NID 00012] PE 0 exit signal Segmentation fault
[NID 00012] 2011-03-28 10:12:56 Apid 23655: initiated application termination
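A quick look at other questions didn't really explain why I'm having trouble. The arrays are contiguous, so I shouldn't have problems with non-contiguous memory access, and I'm passing the correct pointers in the correct order. Does anyone have any ideas?
The source code is below. It is hard-coded for specific numbers because I don't want to deal with variable input and rank sizes yet.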
#include <mpi.h>
#include <iostream>
using std::endl;
using std::cout;
#include <fstream>
using std::ifstream;
using std::ofstream;
#include <algorithm>
using std::sort;
#define SIZEOF_INPUT 10000000
#define NUMTHREADS 100
#define SIZEOF_SUBARRAY SIZEOF_INPUT/NUMTHREADS
int main(int argc, char** argv){
MPI_Init(&argc, &argv);
int input[SIZEOF_INPUT];
int tempbuf[SIZEOF_SUBARRAY];
int myRank;
MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
/*
Read input from file
*/
if(myRank == 0){
cout << "reading input" << endl;
ifstream in(argv[1]);
for(int i = 0; i < SIZEOF_INPUT; ++i)
in >> input[i];
cout << "Scattering input" << endl;
}
// Scatter, Sort, and Gather again
MPI_Scatter(input,SIZEOF_INPUT,MPI_INT,tempbuf,SIZEOF_SUBARRAY,MPI_INT,0,MPI_COMM_WORLD);
cout << "Rank " << myRank << "Sorting" << endl;
sort(tempbuf,tempbuf+SIZEOF_SUBARRAY);
MPI_Gather(tempbuf,SIZEOF_SUBARRAY,MPI_INT,input,SIZEOF_INPUT,MPI_INT,0,MPI_COMM_WORLD);
if(myRank == 0){
cout << "Sorting final output" << endl;
// I'm doing a multi-queue merge here using tricky pointer games
//list of iterators representing things in the queue
int* iterators[NUMTHREADS];
//The ends of those iterators
int* ends[NUMTHREADS];
//Set up iterators and ends
for(int i = 0; i < NUMTHREADS; ++i){
iterators[i] = input + (i*SIZEOF_SUBARRAY);
ends[i] = iterators[i] + SIZEOF_SUBARRAY;
}
ofstream out(argv[2]);
int ULTRA_MAX = SIZEOF_INPUT + 1;
int* ULTRA_MAX_POINTER = &ULTRA_MAX;
while(true){
int** curr_min = &ULTRA_MAX_POINTER;
for(int i = 0 ; i < NUMTHREADS; ++i)
if(iterators[i] < ends[i] && *iterators[i] < **curr_min)
curr_min = &iterators[i];
if(curr_min == &ULTRA_MAX_POINTER) break;
out << **curr_min << endl;
++(*curr_min);
}
}
MPI_Finalize();
}
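Any help is greatly appreciated. Regards, Zach
Answer 0 (score: 3)
Ha! It took me a while to see this.
The trick is that in MPI_Scatter, sendcount is the amount sent to each process, not the total. The same goes for MPI_Gather: recvcount is the amount received from each process. That is, it works like MPI_Scatterv with its counts; the count is per process, but here it is assumed to be the same for every process.
So this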
MPI_Scatter(input,SIZEOF_SUBARRAY,MPI_INT,tempbuf,SIZEOF_SUBARRAY,MPI_INT,0,MPI_COMM_WORLD);
cout << "Rank " << myRank << "Sorting" << endl;
MPI_Gather(tempbuf,SIZEOF_SUBARRAY,MPI_INT,input,SIZEOF_SUBARRAY,MPI_INT,0,MPI_COMM_WORLD);
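works for me.
Also, be careful about allocating large arrays on the stack; I know this is just an example problem, but for me this caused a crash. Doing it dynamically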
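To make the per-process counting rule concrete, here is a small plain C++ model of it. It makes no MPI calls; `elements_read_by_root` and `block_for_rank` are hypothetical helpers invented for illustration, and the sketch only assumes that the root's send buffer is treated as consecutive blocks of sendcount elements, one block per rank.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ model of MPI_Scatter's counting rule (no MPI calls here).
// The root's send buffer is treated as nprocs consecutive blocks of
// sendcount elements; block r goes to rank r.

// Total number of elements the root reads from its send buffer.
std::size_t elements_read_by_root(std::size_t sendcount, std::size_t nprocs) {
    return sendcount * nprocs;
}

// Hypothetical helper: the block that rank `rank` would receive.
std::vector<int> block_for_rank(const std::vector<int>& sendbuf,
                                std::size_t sendcount,
                                std::size_t nprocs,
                                std::size_t rank) {
    assert(rank < nprocs);
    // The send buffer must hold sendcount elements for every rank.
    assert(sendbuf.size() >= elements_read_by_root(sendcount, nprocs));
    auto first = sendbuf.begin() + static_cast<std::ptrdiff_t>(rank * sendcount);
    return std::vector<int>(first, first + static_cast<std::ptrdiff_t>(sendcount));
}
```

Under this model, and assuming the job runs on 100 ranks, passing SIZEOF_INPUT as sendcount asks the root to read 100 × 10,000,000 elements from a buffer that only holds 10,000,000, which is consistent with the crash on PE 0. Passing SIZEOF_SUBARRAY makes the total exactly SIZEOF_INPUT.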
int *input = new int[SIZEOF_INPUT];
int *tempbuf = new int[SIZEOF_SUBARRAY];
//....
delete [] input;
delete [] tempbuf;
fixed that problem.
Answer 1 (score: 1)
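As a variation on the same idea, the buffers could be held in std::vector so that the memory is released automatically. This is only a sketch: the names `kInputSize`, `kNumRanks`, `kSubarraySize`, `Buffers`, and `make_buffers` are made up here and simply mirror the macros in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Constants mirroring the question's macros. Typed constants also avoid
// the unparenthesized SIZEOF_INPUT/NUMTHREADS macro expansion.
constexpr std::size_t kInputSize = 10000000;
constexpr std::size_t kNumRanks = 100;
constexpr std::size_t kSubarraySize = kInputSize / kNumRanks;

// Bytes that a local `int input[kInputSize]` would need on the stack.
constexpr std::size_t input_bytes() { return kInputSize * sizeof(int); }

// Heap-backed buffers; std::vector frees its storage automatically,
// so no matching delete[] is needed.
struct Buffers {
    std::vector<int> input;
    std::vector<int> tempbuf;
};

Buffers make_buffers() {
    Buffers b{std::vector<int>(kInputSize), std::vector<int>(kSubarraySize)};
    assert(b.input.size() == kInputSize);
    return b;
}
```

With a 4-byte int, the input array alone is about 40 MB, while default stack limits are often only a few megabytes, which would explain the crash described above. Where a raw pointer is required, as in the MPI_Scatter and MPI_Gather calls, `input.data()` and `tempbuf.data()` can be passed.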
int* iterators[NUMTHREADS];
//The ends of those iterators
int* ends[NUMTHREADS];
//Set up iterators and ends
for(int i = 0; i < NUMTHREADS; ++i){
iterators[i] = input + (i*SIZEOF_SUBARRAY); // problem
ends[i] = iterators[i] + SIZEOF_SUBARRAY; // problem
}
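Both iterators and ends are arrays of integer pointers that point nowhere or to garbage. But the for loop tries to store values at the locations they point to, which results in a segmentation fault. The program should first allocate memory for the pointers to point to, and only then store values at those locations.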
for( int i=0 ; i < NUMTHREADS; ++i )
{
iterators[i] = new int;
ends[i] = new int;
}
// Now do the earlier operation which caused problem
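Since the program manages these resources itself (that is, acquires them with new), it should return them to the free store with delete (or delete[] for arrays) when they are no longer needed. Using std::vector instead of managing the resources yourself makes this easy.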
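Following the std::vector suggestion, the final merge step from the question could be written without raw pointers or a sentinel value. The sketch below keeps the question's linear scan over the chunk heads but tracks positions as indices; `merge_chunks` is a hypothetical name, and the code assumes every chunk is already sorted and has the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merge `nchunks` sorted chunks of `chunk` elements each, stored back to
// back in `data`, by repeatedly scanning the head of every chunk.
std::vector<int> merge_chunks(const std::vector<int>& data,
                              std::size_t nchunks, std::size_t chunk) {
    assert(data.size() == nchunks * chunk);
    std::vector<std::size_t> pos(nchunks);  // current position in each chunk
    std::vector<std::size_t> end(nchunks);  // one past the last element
    for (std::size_t i = 0; i < nchunks; ++i) {
        pos[i] = i * chunk;
        end[i] = pos[i] + chunk;
    }
    std::vector<int> out;
    out.reserve(data.size());
    while (out.size() < data.size()) {
        std::size_t best = nchunks;  // nchunks means "none found yet"
        for (std::size_t i = 0; i < nchunks; ++i) {
            if (pos[i] < end[i] &&
                (best == nchunks || data[pos[i]] < data[pos[best]])) {
                best = i;
            }
        }
        out.push_back(data[pos[best]]);  // smallest remaining head
        ++pos[best];
    }
    return out;
}
```

Because the loop stops once every element has been emitted, no ULTRA_MAX sentinel is needed, so the result does not depend on the input values staying below SIZEOF_INPUT + 1.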