MPI_Isend and MPI_Wait cause a segmentation fault for large matrices

Time: 2014-04-10 20:31:02

Tags: mpi large-data

The code just allocates memory for a matrix and uses non-blocking routines to send it from rank 0 to rank 1. It works correctly for a smaller matrix size (1024), but it causes a segmentation fault for a larger size (16384). Here is the code:

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        double **A;
        int i, j, size, rankid, rankall;
        size = 16384;                       /* works when size = 1024 */
        MPI_Request reqr, reqs;
        MPI_Status star, stas;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &rankall);
        MPI_Comm_rank(MPI_COMM_WORLD, &rankid);

        /* allocate the matrix one row at a time */
        A = (double **)calloc(size, sizeof(double *));
        for (i = 0; i < size; i++) {
            A[i] = (double *)calloc(size, sizeof(double));
            for (j = 0; j < size; j++) {
                if (rankid == 0) {
                    A[i][j] = 1;
                }
            }
        }

        /* rank 0 sends the whole matrix to rank 1 with non-blocking calls */
        if (rankid == 0) {
            MPI_Isend(&A[0][0], size * size, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD, &reqs);
            MPI_Wait(&reqs, &stas);
        }
        if (rankid == 1) {
            MPI_Irecv(&A[0][0], size * size, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, &reqr);
            MPI_Wait(&reqr, &star);
        }

        MPI_Finalize();
        return 0;
    }

The debugger shows the following backtrace:

    #0  0x00007ffff7947093 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x000000000043a5b0 in MPID_Segment_contig_m2m ()
    #2  0x00000000004322cb in MPID_Segment_manipulate ()
    #3  0x000000000043a7ba in MPID_Segment_pack ()
    #4  0x000000000042bb99 in lmt_shm_send_progress ()
    #5  0x0000000000427e1f in MPID_nem_lmt_shm_start_send ()
    #6  0x0000000000425aff in pkt_CTS_handler ()
    #7  0x000000000041fb52 in MPIDI_CH3I_Progress ()
    #8  0x0000000000405bc1 in MPIR_Wait_impl ()
    #9  0x000000000040594e in PMPI_Wait ()
    #10 0x0000000000402ea5 in main (argc=1, argv=0x7fffffffe4a8)
        at ./simpletest.c:26
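
For what it's worth, the backtrace dies while MPICH is packing the send buffer, and the buffer passed to MPI_Isend here is not actually contiguous: every row of A comes from its own calloc, so &A[0][0] is only guaranteed to point at size doubles, not size*size. A minimal sketch of a contiguous layout that keeps the A[i][j] indexing (alloc_contiguous is a name introduced here for illustration, not part of the original code):

    #include <stdlib.h>

    /* Sketch: allocate an n x n matrix as ONE contiguous block, with a
     * row-pointer array on top so A[i][j] indexing still works.  With
     * this layout, &A[0][0] really does span n*n doubles, so
     * MPI_Isend(&A[0][0], n*n, MPI_DOUBLE, ...) reads valid memory. */
    static double **alloc_contiguous(int n)
    {
        double *data  = calloc((size_t)n * (size_t)n, sizeof(double));
        double **rows = malloc((size_t)n * sizeof(double *));
        if (!data || !rows) { free(data); free(rows); return NULL; }
        for (int i = 0; i < n; i++)
            rows[i] = data + (size_t)i * n;  /* row i points into the single block */
        return rows;
    }

This would also explain why 1024 only appears to work: with the row-by-row allocation, MPI reads far past the end of A[0], and at small sizes that out-of-bounds read presumably lands in heap memory that happens to be mapped.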

0 Answers:

No answers yet.