Why does this program get stuck when more than one process is started?

Asked: 2011-12-10 23:22:04

Tags: c mpi

This program estimates Pi by randomly throwing "darts" (sampling points) at a circle of radius = 1 inscribed in a square board of side length 2. Using the relationship

Area of circle / Area of square = Pi/4

we can estimate Pi with the same relationship expressed as

Darts inside circle / Total darts thrown = Pi/4
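
For reference, the geometric step behind that ratio (spelled out here; it is not in the original post): with radius r = 1, the circle has area \pi r^2 and the enclosing square of side 2r has area (2r)^2, so

\frac{A_{\text{circle}}}{A_{\text{square}}} = \frac{\pi r^2}{(2r)^2} = \frac{\pi}{4}
\qquad\Rightarrow\qquad
\pi \approx 4 \cdot \frac{\text{darts inside circle}}{\text{total darts thrown}}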

The program runs fine when NDARTS is specified with a #define. However, when NDARTS is a value that's read via scanf and then broadcast, it gets stuck as soon as more than one process is launched via mpirun:

mpirun -np 1 ./pi_montecarlo.x

   Monte Carlo Method to estimate Pi 

Introduce Number of Darts 
10000
  Number of processes: 1 
  Number of darts: 10000 
Known value of PI  : 3.1415926535 
Estimated Value of PI  : 3.1484000000
Error Percentage   : 0.21668457
Time    : 0.00060296



mpirun -np 2 ./pi_montecarlo.x

Monte Carlo Method to estimate Pi 

Introduce Number of Darts 
10000
Number of processes: 2 
Number of darts: 10000 

^ It gets stuck here.

Why? Is this some MPI-implementation-specific issue? Should I try another MPI implementation (I think I'm running LAM)? Can you run it with at least 2 processes on your own box?

/*
mpicc -g -Wall -lm pi_montecarlo3.c -o pi_montecarlo.x 

mpirun -np 4 ./pi_montecarlo.x
*/

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include <sys/time.h>   /* gettimeofday(), struct timeval */
#include <mpi.h>

#define MASTER 0
#define PI 3.1415926535

double pseudo_random (double a, double b) {
    double r; 
    r = ((b-a) * ((double) rand() / (double) RAND_MAX)) +a;
    return r; 
}

int main(int argc, char*argv[]){
    long long int NDARTS;

    int proc_id,    /* rank of this process                 */
        n_procs,    /* total number of processes            */
        llimit,     /* lower limit of the sampling interval */
        ulimit,     /* upper limit of the sampling interval */
        n_circle,   /* darts that landed inside the circle  */
        i;          /* loop counter                         */

    double pi_current, /* this process' estimate of Pi        */
           pi_sum,     /* sum of all estimates (on MASTER)    */
           x,          /* dart coordinates                    */
           y,
           z,          /* x^2 + y^2                           */
           error,      /* percentage error vs. the known Pi   */
           start_time, /* MPI_Wtime() timestamps              */
           end_time;

    struct timeval stime;

    llimit = -1;
    ulimit = 1;
    n_circle =0; 

    MPI_Init(&argc, &argv); 

    MPI_Comm_rank (MPI_COMM_WORLD, &proc_id);
    MPI_Comm_size (MPI_COMM_WORLD, &n_procs);

    if (proc_id == MASTER){
        printf("\nMonte Carlo Method to estimate Pi \n\n");

            printf("Introduce Number of Darts \n");

            scanf("%lld",&NDARTS); 

        printf("  Number of processes: %d \n", n_procs);
        printf("  Number of darts: %lld \n", NDARTS);

            MPI_Bcast(&NDARTS, 1, MPI_LONG_LONG_INT, 0, MPI_COMM_WORLD);

            start_time = MPI_Wtime();
    }

    gettimeofday(&stime, NULL); 
    srand(stime.tv_usec * stime.tv_usec * stime.tv_usec * stime.tv_usec);

    for (i=1; i<=NDARTS;i++){
        x = pseudo_random(llimit, ulimit);
        y = pseudo_random(llimit, ulimit);

        z = pow(x,2) + pow(y,2);

        if (z<=1.0){
            n_circle++;
        }
    }

    pi_current = 4.0 * (double)n_circle / (double) NDARTS; 

    MPI_Reduce (&pi_current, &pi_sum, 1, MPI_DOUBLE, MPI_SUM, MASTER, MPI_COMM_WORLD);

       if (proc_id == MASTER) {
        pi_sum = pi_sum / n_procs;

        error = fabs ((pi_sum -PI) / PI) *100;

        end_time = MPI_Wtime();

        printf("Known value of PI  : %11.10f \n", PI);
        printf("Estimated Value of PI  : %11.10f\n", pi_sum);
        printf("Error Percentage   : %10.8f\n", error);
        printf("Time    : %10.8f\n\n", end_time - start_time);

    }

    MPI_Finalize();

    return 0;
}

1 Answer:

Answer 0 (score: 1):

A broadcast doesn't "push" the data out to the other processors.

Almost all MPI communication requires the active participation of all the processors involved. For instance, to send a message between two processors, the sender has to call something like MPI_Send() and the receiver has to call something like MPI_Recv().
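
As an illustration of that pairing (a minimal sketch, not part of the original answer), both sides of a point-to-point transfer have to make a matching call:

#include <mpi.h>
#include <stdio.h>

/* Sketch: rank 0 sends one int to rank 1; the transfer only completes
   because the sender calls MPI_Send() AND the receiver calls MPI_Recv(). */
int main(int argc, char *argv[]) {
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}

(Run this with at least two processes; with -np 1 there is no rank 1 to receive.)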

The same goes for collective communications; that's why everyone calls MPI_Reduce(), for example. Likewise, everyone has to call MPI_Bcast(), not just the process that has the original data; the receiving ranks make the very same call:

if (proc_id == MASTER){
    /* ... */
    scanf("%lld",&NDARTS); 
}

MPI_Bcast(&NDARTS, 1, MPI_LONG_LONG_INT, 0, MPI_COMM_WORLD);

if (proc_id == MASTER) {
    start_time = MPI_Wtime();
}

/* ... */

Incidentally, when you seed your random number generator, you probably want to make sure the seed is different on every processor; just fold proc_id into it rather than relying only on clock differences to vary the seed...
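
A minimal sketch of that idea (my illustration, not code from the answer): mix the rank into the clock-based seed so that ranks started in the same microsecond still draw different sequences:

#include <stdlib.h>
#include <sys/time.h>

/* Seed rand() differently on every rank by folding proc_id into the
   microsecond counter (the multiplier is just an arbitrary odd constant
   used for mixing). */
static void seed_rng_per_rank(int proc_id) {
    struct timeval stime;
    gettimeofday(&stime, NULL);
    srand((unsigned int) stime.tv_usec ^ ((unsigned int) proc_id * 2654435761u));
}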