MPI_Scatter Segfaulting

Date: 2011-03-28 16:25:16

Tags: c++ sorting parallel-processing mpi

I'm working on a parallel sorting program to learn MPI, and I keep having trouble with MPI_Scatter. Every time I try to run, I get the following:

reading input
Scattering input
_pmii_daemon(SIGCHLD): [NID 00012] PE 0 exit signal Segmentation fault
[NID 00012] 2011-03-28 10:12:56 Apid 23655: initiated application termination

Looking over other questions about this hasn't really answered why I'm having trouble - the arrays are contiguous, so I shouldn't have problems with non-contiguous memory access, and I'm passing the correct pointers in the correct order. Does anyone have any ideas?

Source code is below - it's specified for particular numbers because I don't want to deal with variable input and rank sizes yet.

#include <mpi.h>

#include <iostream>
using std::endl;

using std::cout;

#include <fstream>
using std::ifstream;
using std::ofstream;
#include <algorithm>
using std::sort;

#define SIZEOF_INPUT 10000000
#define NUMTHREADS 100
#define SIZEOF_SUBARRAY SIZEOF_INPUT/NUMTHREADS

int main(int argc, char** argv){
    MPI_Init(&argc, &argv);

    int input[SIZEOF_INPUT];
    int tempbuf[SIZEOF_SUBARRAY];

    int myRank;
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

    /*
            Read input from file
    */
    if(myRank == 0){
            cout << "reading input" << endl;
            ifstream in(argv[1]);
            for(int i = 0; i < SIZEOF_INPUT; ++i)
                    in >> input[i];
            cout << "Scattering input" << endl;
    }

    // Scatter, Sort, and Gather again    
    MPI_Scatter(input,SIZEOF_INPUT,MPI_INT,tempbuf,SIZEOF_SUBARRAY,MPI_INT,0,MPI_COMM_WORLD);
    cout << "Rank " << myRank << "Sorting" << endl;
    sort(tempbuf,tempbuf+SIZEOF_SUBARRAY);
    MPI_Gather(tempbuf,SIZEOF_SUBARRAY,MPI_INT,input,SIZEOF_INPUT,MPI_INT,0,MPI_COMM_WORLD);

    if(myRank == 0){
            cout << "Sorting final output" << endl;
            // I'm doing a multi-queue merge here using tricky pointer games

            //list of iterators representing things in the queue
            int* iterators[NUMTHREADS];
            //The ends of those iterators
            int* ends[NUMTHREADS];

            //Set up iterators and ends
            for(int i = 0; i < NUMTHREADS; ++i){
                    iterators[i] = input + (i*SIZEOF_SUBARRAY);
                    ends[i] = iterators[i] + SIZEOF_SUBARRAY;
            }

            ofstream out(argv[2]);
            int ULTRA_MAX = SIZEOF_INPUT + 1;
            int* ULTRA_MAX_POINTER = &ULTRA_MAX;
            while(true){
                    int** curr_min = &ULTRA_MAX_POINTER;
                    for(int i = 0 ; i < NUMTHREADS; ++i)
                            if(iterators[i] < ends[i] && *iterators[i] < **curr_min)
                                    curr_min = &iterators[i];

                    if(curr_min == &ULTRA_MAX_POINTER) break;

                    out << **curr_min << endl;
                    ++(*curr_min);
            }
    }

    MPI_Finalize();
}

Any help would be greatly appreciated. Regards, Zach

2 Answers:

Answer 0 (score: 3):

Ha! It took me a while to see this one.

The trick is that in MPI_Scatter, sendcount is the amount to send to each process, not the total. Same with the gather: it's the amount received from each process. That is, it's like MPI_Scatterv with counts; the count is per-process, but in this case each one is assumed to be the same.

So this:

MPI_Scatter(input,SIZEOF_SUBARRAY,MPI_INT,tempbuf,SIZEOF_SUBARRAY,MPI_INT,0,MPI_COMM_WORLD);
cout << "Rank " << myRank << "Sorting" << endl;
MPI_Gather(tempbuf,SIZEOF_SUBARRAY,MPI_INT,input,SIZEOF_SUBARRAY,MPI_INT,0,MPI_COMM_WORLD);

works for me.

Also, be careful about allocating large arrays on the stack; I know this is just an example problem, but for me those allocations were immediately causing crashes. Doing it dynamically,

int *input = new int[SIZEOF_INPUT];
int *tempbuf = new int[SIZEOF_SUBARRAY];
//....
delete [] input;
delete [] tempbuf;

fixed things up.

Answer 1 (score: 1):

int* iterators[NUMTHREADS];
//The ends of those iterators
int* ends[NUMTHREADS];

//Set up iterators and ends
for(int i = 0; i < NUMTHREADS; ++i){
    iterators[i] = input + (i*SIZEOF_SUBARRAY); // problem
    ends[i] = iterators[i] + SIZEOF_SUBARRAY;   // problem
}

Both iterators and ends are arrays of integer pointers that point nowhere (or at garbage). The for loop then tries to treat them as pointing at valid locations, which causes the segmentation fault. The program should first allocate memory the iterators can point to, and only then store values at the locations they point to.

for( int i=0 ; i < NUMTHREADS; ++i )
{
    iterators[i] = new int;
    ends[i] = new int;
}
// Now do the earlier operation which caused problem

Since the program manages resources (i.e., memory obtained from new), it should hand them back to the free store with delete when they are no longer needed. Instead of managing resources yourself, use std::vector, which makes this easy.