MPI program in C crashes

Posted: 2014-04-29 03:26:14

Tags: c mpi

My program runs and at some point crashes. After poring over the code, I've come to the conclusion that I can't figure out why. Can someone offer some help? Below is main(). I'm happy to post the other source files if you ask; I just don't want to post too much.

Thanks, Scott

int main(int argc, char *argv[])
{
//Global data goes here
    int rank, nprocs, i, j, k, rc, chunkSize; 
    double start, finish, difference;
    MPI_Status status;
    int *masterArray;
    int *slaveArray;
    int *subArray; 
    //Holder for subArrays for reassembly of subArrays
    int **arrayOfArrays; 
    //Beginning and ARRAYSIZE indices of array 
    Range range;

    //Begin execution

    //printf("%s", "Entering main()\n");
    MPI_Init(&argc, &argv); /* START MPI */

    /* DETERMINE RANK OF THIS PROCESSOR */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    //printf("My rank %d\n", rank);

    /* DETERMINE TOTAL NUMBER OF PROCESSORS */
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    //printf("Number of processes %d\n", nprocs);

    //Compute chunk size
    chunkSize = computeChunkSize(ARRAYSIZE, nprocs);
    //debug("%s: %d\n", "Chunk size", chunkSize);
    //                      N/#processes
    slaveArray = (int *)malloc(sizeof(int) * (chunkSize+1)); 

    //An array of int arrays (a pointer to pointers to ints)   
    arrayOfArrays = (int **)malloc(sizeof(int *) * (nprocs-1));

    /****************************************************************
     ****************************************************************
     ************************ MASTER id == 0 ************************
     ****************************************************************
     ***************************************************************/

    /* MASTER: rank is 0. Problem decomposition- here simple matter of splitting 
    the master array evenly across the number of worker bees */
    if(rank == MASTER)
    {
        debug("%s", "Entering MASTER process\n");

        //Begin timing the runtime of this application
        start = MPI_Wtime();
        debug("%s: %lg\n", "Start time", start);

        //Seed the random number generator
        srand(time(NULL));
        //Create random array of ints for mpi processing        
        masterArray = createRandomArray();

        debug("%s %d %s %d %s\n", "Master array of random integers from ", BEGIN, " to ", ARRAYSIZE-1, "\n");

        /*Create the subArray to be sent to the slaves- malloc returns a pointer 
        to void, so explicitly coerce the pointer into the desired type with a cast */
        subArray = (int *)malloc(sizeof(int) * (chunkSize+1)); 

    //Initialize range
        range = (Range){.begin = 0, .end = (ARRAYSIZE/(nprocs-1))};  
        debug("%s %d %s %d\n", "Range: ", range.begin, " to ", range.end);

        //Master decomposes the problem set: begin and end of each subArray sent to slaves
        for(i = 1;i < nprocs; i++)
        {
            //printf("%s", "Inside loop for Master send\n");

            range = decomposeProblem(range.begin, range.end, ARRAYSIZE, nprocs, i);

            debug("%s %d to %d%s", "Range from decomposition", range.begin, range.end, "\n");
            //Index for subArray
            k = 0;

            //Transfer the slice of the master array to the subArray
            for(j = range.begin; j < range.end; j++)
            {    
                subArray[k] = masterArray[j];
                //printf("%d\t", subArray[k]);
                k++;   
            }
            //printf("%s", "\n");
            //Show sub array contents
            debug("%s", "Showing subArray before master sends...\n");
            showArray(subArray, 0, k);

            //printf("%s %d%s", "Send to slave", i, " from master \n");
            debug("%s %d%s", "Send to slave", i, " from master \n");            
            /***************************************************************
            ****************************************************************
            ************************ MASTER: SEND **************************
            ****************************************************************
            ***************************************************************/
            //MPI_Send(buffer,count,type,dest,tag,comm)                 
            rc = MPI_Send(&subArray, chunkSize, MPI_INT, i, 0, MPI_COMM_WORLD);
        }
        //Blocks until the slaves finish their work and start sending results back to master
        /*MPI_Recv is "blocking" in the sense that when the process (in this case 
        my_rank == 0) reaches the MPI_Recv statement, it will wait until it 
        actually receives the message (another process sends it). If the other process 
        is not ready to Send, then the process running on my_rank == 0 will simply 
        remain idle. If the message is never sent, my_rank == 0 will wait a very long time!*/
        for(i = 1;i < nprocs; i++)
        {
            debug("%s %d%s ", "Receive from slave", i, " to master\n");         
            /***************************************************************
            ****************************************************************
            ************************ MASTER: RECEIVE ***********************
            ****************************************************************
            ***************************************************************/
            debug("Rank %d approaching master MPI_Probe.\n", rank);
            // Probe for an incoming message from process zero
            MPI_Probe(rank, 0, MPI_COMM_WORLD, &status);
            debug("Rank %d going by MPI_Probe.\n", rank);

            // When probe returns, the status object has the size and other
            // attributes of the incoming message. Get the size of the message
            MPI_Get_count(&status, MPI_INT, &chunkSize);

            rc = MPI_Recv(&slaveArray, chunkSize, MPI_INT, i, 0, MPI_COMM_WORLD, &status);

            debug("Slave %d dynamically received %d numbers from 0.\n", rank, chunkSize);
            //Store subArray in 2D array
            debug("%s", "Storing subArray in 2DArray...\n");

            arrayOfArrays[i-1] = slaveArray;
        }
        //rebuild entire sorted array from sorted subarrays
        reconstructArray(arrayOfArrays);
        //starting with smallest value, validate that each element is <= next element
        validateArray(arrayOfArrays);

        //Finish timing the runtime of this application 
        finish = MPI_Wtime();
        //Compute the runtime
        difference = finish-start;
        //Inform user
        debug("%s", "Exiting MASTER process\n");
        debug("%s %lg", "Time for completion:", difference);
    }
    /****************************************************************
     ****************************************************************
     ************************* End MASTER ***************************
     ****************************************************************
     ***************************************************************/

    /****************************************************************
     ****************************************************************
     ************************ SLAVE id > 0 **************************
     ****************************************************************
     ***************************************************************/
    else
    {
        debug("%s", "Entering SLAVE process\n");
        //by process id
        debug("%s %d%s", "Receive in slave", rank, " from master \n");

        debug("Rank %d approaching Slave MPI_Probe.\n", rank);
        // Probe for an incoming message from process zero

        MPI_Probe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
        debug("Rank %d going by Slave MPI_Probe.\n", rank);
        // When probe returns, the status object has the size and other
        // attributes of the incoming message. Get the size of the message
        MPI_Get_count(&status, MPI_INT, &chunkSize);
        debug("Count %d and chunkSize %d after Slave MPI_Get_count.\n", rank, chunkSize);
        /***************************************************************
         ***************************************************************
         ******************** SLAVE: RECEIVE ***************************
         ***************************************************************
         ***************************************************************/
        rc = MPI_Recv(&subArray, chunkSize, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        debug("%d dynamically received %d numbers from 0.\n", rank, chunkSize);

        /*Store the received subArray in the slaveArray for processing and sending back
            to master*/ 
        slaveArray = subArray;

        //Take a look at incoming subArray: size = N/#processes)
        debug("%s ", "Show the slaveArray contents in slave receive\n");
        debug("Before bubblesort: start %d, finish: %d\n", (rank-1) * chunkSize, rank * chunkSize);

        //showArray(slaveArray, (rank-1) * chunkSize, rank * chunkSize);
        //Running the actual sorting algorithm on the current slaves subArray
        //bubble(slaveArray, ARRAYSIZE);
        //Return sorted subArray back to the master by process id

        debug("%s %d%s", "Send from slave", i, " to master \n");

        /***************************************************************
         ****************************************************************
         ************************ SLAVE: SEND ***************************
         ****************************************************************
         ***************************************************************/
        //MPI_Send(buffer,count,type,dest,tag,comm) 
        rc = MPI_Send(&slaveArray, chunkSize, MPI_INT, 0, 0, MPI_COMM_WORLD);
        debug("%s", "Exiting SLAVE process\n");
    }
    /****************************************************************
     ****************************************************************
     ************************* END SLAVE ****************************
     ****************************************************************
     ***************************************************************/
    //Clean up memory
    //free(subArray);
    //free(masterArray);
    //free(slaveArray);
    //free(arrayOfArrays);
    rc = MPI_Get_count(&status, MPI_INT, &chunkSize);
    debug("Process %d: received %d int(s) from process %d with tag %d \n", rank, chunkSize, status.MPI_SOURCE, status.MPI_TAG);
    /* EXIT MPI */
    MPI_Finalize();
    debug("%s", "Exiting main()\n");
    return 0;
}

2 answers:

Answer 0 (score: 1):

Check that chunkSize >= 0, that nProcs >= 2, and that malloc does not return null. I mean: add code that does this for each and every malloc, and exit if those conditions don't hold; don't just do ad-hoc debugging.
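
For instance, a minimal sketch of those guards, reusing the names from the question's code (the MPI_Abort calls and the error messages are my own choice, not from the original):

if (nprocs < 2) {
    fprintf(stderr, "Need at least 2 processes, got %d\n", nprocs);
    MPI_Abort(MPI_COMM_WORLD, 1);
}
chunkSize = computeChunkSize(ARRAYSIZE, nprocs);
if (chunkSize < 0) {
    fprintf(stderr, "Bad chunk size: %d\n", chunkSize);
    MPI_Abort(MPI_COMM_WORLD, 1);
}
slaveArray = (int *)malloc(sizeof(int) * (chunkSize + 1));
if (slaveArray == NULL) {
    perror("malloc slaveArray");
    MPI_Abort(MPI_COMM_WORLD, 1);
}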

This loop can overflow the bounds:

for(j = range.begin; j < range.end; j++)
{    
    subArray[k] = masterArray[j];
    k++;   
}

You don't show the code that allocates masterArray. (And you don't pass nprocs into that function, so how can it match ARRAYSIZE/(nprocs-1)?)

Also, subArray holds chunkSize+1 elements, but range.end is defined from ARRAYSIZE/(nprocs-1). Based on the code you've shown (which includes neither ARRAYSIZE nor how chunkSize and nprocs are actually computed), there is no reason to believe that ARRAYSIZE/(nprocs-1) <= chunkSize+1 will always hold.

To avoid random segfaults, you should always, always check that an array index is within the bounds of the array before applying the [] operator.
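
As a sketch of that check (assuming, as the question's code suggests, that masterArray has ARRAYSIZE elements and subArray has chunkSize+1), the copy loop could guard both indices before each access:

for (j = range.begin, k = 0; j < range.end; j++, k++)
{
    /* Bail out before either index leaves its array */
    if (j < 0 || j >= ARRAYSIZE || k > chunkSize)
    {
        fprintf(stderr, "Index out of bounds: j=%d, k=%d\n", j, k);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    subArray[k] = masterArray[j];
}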

Answer 1 (score: 0):

OK, maybe it will be easier to point at specific spots in the code to get help working through this. I tried to write a function that mallocs an int* array passed in by reference, and that tests whether the array is null and whether it is the size I wanted. Below is the caller. One thing I noticed is that the sizeof(buffer) call doesn't return what I thought it would. So how can I do that check? Also, the caller invokes createRandomArray by passing in an int*. Can you pass by reference like that? Am I using the right syntax to make sure masterArray gets filled in the caller (main()) through a call by reference?

void safeMalloc(int *buffer, int size, int line_num)
{
    buffer = (int *)malloc(sizeof(int) * size);
    //Test that malloc allocated at least some memory
    if(buffer == NULL) 
    {
        debug("ERROR: cannot allocate any memory for line %d\n", line_num);
        perror(NULL);
        exit(EXIT_FAILURE);
    }
    else
        debug("Successfully created the array through malloc()\n");
    //Test that malloc allocated the correct amount of memory
    if(sizeof(buffer) != size)
    {
        debug("ERROR: Created %d bytes array instead of %d bytes through malloc() on line %d.\n", sizeof(buffer), size, line_num);
        perror(NULL);
        exit(EXIT_FAILURE);
    }
}

void createRandomArray(int *masterArray)
{
    int i;
    debug("Entering createRandomArray()\n");
    safeMalloc(masterArray, ARRAYSIZE, 21);
    for(i = BEGIN;i < ARRAYSIZE;i++)
    {
        masterArray[i] = (rand() % (ARRAYSIZE - BEGIN)) + BEGIN;
        debug("%d ", masterArray[i]);
    }
    debug("\n");
    debug("\n Exiting createRandomArray()\n");
}
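
A possible fix, sketched under two assumptions rather than tested against the rest of the program: C passes every argument by value, so for the caller to see the allocation the function must receive the address of the pointer (an int **); and sizeof(buffer) on a pointer parameter yields the size of the pointer itself (typically 8 bytes), never the size of the allocated block, which standard C gives you no way to query, so that check is dropped:

void safeMalloc(int **buffer, int size, int line_num)
{
    /* Write through the pointer-to-pointer so the caller sees the result */
    *buffer = (int *)malloc(sizeof(int) * size);
    if(*buffer == NULL)
    {
        debug("ERROR: cannot allocate any memory for line %d\n", line_num);
        perror(NULL);
        exit(EXIT_FAILURE);
    }
}

void createRandomArray(int **masterArray)
{
    int i;
    safeMalloc(masterArray, ARRAYSIZE, 21);
    for(i = BEGIN; i < ARRAYSIZE; i++)
        (*masterArray)[i] = (rand() % (ARRAYSIZE - BEGIN)) + BEGIN;
}

The call in main() would then become createRandomArray(&masterArray); instead of createRandomArray(masterArray);.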