Fortran MPI allgatherv with a derived type for a 2D array

Date: 2017-03-20 22:57:01

Tags: fortran 2d mpi derived-types

I need help with a Fortran MPI problem: gathering data from different columns of a 2D array. The catch is that not all of the data in each column is used, and the number of columns assigned to each process is unequal. All processes start with an equivalent global view of the data, each process performs work on specific columns, and at the end information is exchanged so that all processes again share the common view. The problem is similar to MPI partition and gather 2D array in Fortran and Sending 2D arrays in Fortran with MPI_Gather.

A drawn example: data(8,4) using 3 MPI processes

---------------------
| a1 | b1 | c1 | d1 |
| a2 | b2 | c2 | d2 |
| a3 | b3 | c3 | d3 |
| a4 | b4 | c4 | d4 |
| a5 | b5 | c5 | d5 |
| a6 | b6 | c6 | d6 |
| a7 | b7 | c7 | d7 |
| a8 | b8 | c8 | d8 |
---------------------

Process 1 gets 2 columns to work on, while process 2 and process 3 each get 1 column.

-----------  ------  ------
| a1 | b1 |  | c1 |  | d1 |
| a2 | b2 |  | c2 |  | d2 |
| a3 | b3 |  | c3 |  | d3 |
| a4 | b4 |  | c4 |  | d4 |
| a5 | b5 |  | c5 |  | d5 |
| a6 | b6 |  | c6 |  | d6 |
| a7 | b7 |  | c7 |  | d7 |
| a8 | b8 |  | c8 |  | d8 |
-----------  ------  ------

In the real problem the actual size is data(200000,59). This is a preallocated block of memory, and I use only part of each column (always starting at index 1). For the example, suppose I only need the first 3 values of each column.

-----------  ------  ------
| a1 | b1 |  | c1 |  | d1 |
| a2 | b2 |  | c2 |  | d2 |
| a3 | b3 |  | c3 |  | d3 |
| == | == |  | == |  | == |
| a4 | b4 |  | c4 |  | d4 |
| a5 | b5 |  | c5 |  | d5 |
| a6 | b6 |  | c6 |  | d6 |
| a7 | b7 |  | c7 |  | d7 |
| a8 | b8 |  | c8 |  | d8 |
-----------  ------  ------

I am trying to create send and receive datatypes that can accomplish this. My best guess so far is to use MPI_TYPE_VECTOR(COUNT, BLOCKLENGTH, STRIDE, OLDTYPE, NEWTYPE, IERROR).

For this example that would be MPI_TYPE_VECTOR(1, 3, 8, MPI_DOUBLE, newtype, ierr), which should allow each process to send the minimum amount of information; a quick sketch of the type setup follows. With this I figured I should be able to exchange the data with ALLGATHERV.
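As a sketch (illustration only, untested; MPI_DOUBLE_PRECISION is the portable Fortran datatype name, though OpenMPI also accepts MPI_DOUBLE):

  ! describe one block of 3 used values at the top of an 8-value column
  call MPI_TYPE_VECTOR(1, 3, 8, MPI_DOUBLE_PRECISION, newtype, ierr)
  call MPI_TYPE_COMMIT(newtype, ierr)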

MPI_ALLGATHERV(SENDBUF, SENDCOUNT, SENDTYPE, RECVBUF, RECVCOUNT, DISPLS, RECVTYPE, COMM, IERROR), which I would call as MPI_ALLGATHERV(data(1,my_first_col), num_cols_to_be_sent, newtype, data, RECVCOUNT[], DISPLS[], newtype, COMM, IERROR).

As best I can tell, this is the information that each process should send:

Process 1: [a1,a2,a3,b1,b2,b3]
Process 2: [c1,c2,c3]
Process 3: [d1,d2,d3]
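For the drawn example, the receive counts and displacements (counted in units of newtype) would presumably be, again just a sketch of the intent:

  ! process 1 owns columns 1-2, process 2 column 3, process 3 column 4
  integer :: recvcounts(3), displs(3)
  recvcounts = (/ 2, 1, 1 /)   ! number of columns each process contributes
  displs     = (/ 0, 2, 3 /)   ! first column - 1, in units of newtype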

The examples I have seen all either use the full column of data or have displacements that are naturally multiples of the desired subarray; I cannot get the data to unpack into the correct columns. Is that simply not possible, given that the receiving side relies on the size/extent of the type? To be sure, I am confused about the whole extent business. Any help would be appreciated. The real code is working, but here is a quick recreation for viewing and comment (it may not compile, it was put together quickly):

  MODULE PARALLEL
    INTEGER iproc, nproc, rank, ierr
    INTEGER mylow, myhigh, mysize, ichunk, irem
    INTEGER, ALLOCATABLE :: isize(:), idisp(:), ilow(:), ihigh(:)
    DOUBLE PRECISION, ALLOCATABLE :: glob_val(:,:)
    INTEGER newtype
  END MODULE


  PROGRAM MAIN
  USE PARALLEL
  IMPLICIT NONE
  INCLUDE 'mpif.h'

c   **temp variables
  integer i, j
  integer num_rows,num_cols
  integer used_rows

c    ----setup MPI----
  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierr)
  iproc = rank+1  !rank is base 0, rest of fortran base 1

c   ----setup initial data      
  num_rows=20  !contiguous in memory over rows (ie single column)
  num_cols=11  !noncontiguous memory between different columns

  ALLOCATE (isize(nproc))
  ALLOCATE (idisp(nproc))
  ALLOCATE (ilow(nproc))
  ALLOCATE (ihigh(nproc))      
  ALLOCATE (glob_val(num_rows,num_cols))

  glob_val = 1.0*iproc   !set all glob values to process id
  do i=1,num_cols
    do j=1,used_rows
      glob_val(j,i) = iproc+.01*j  !add reference index to used data
    end do
  end do

c   ---setup exchange information
  ichunk = num_cols/nproc
  irem = num_cols -(ichunk*nproc)
  mysize=ichunk
  if(iproc.le.irem) mysize=mysize+1

  mylow=0
  myhigh=0

  do i=1,nproc   !establish global understanding of processes
    mylow=myhigh+1
    myhigh=mylow+ichunk
    if(i.le.irem) myhigh=myhigh+1

    isize(i)=myhigh-mylow+1
    idisp(i)=(mylow-1)    !based on receiving type size/extent
    ilow(i)=mylow
    ihigh(i)=myhigh
  end do
  mylow=ilow(iproc)
  myhigh=ihigh(iproc)

  call MPI_TYPE_VECTOR(1,used_rows,num_rows,MPI_DOUBLE,
 &                     newtype,ierr)
  call MPI_TYPE_COMMIT(newtype,ierr)

c   --- perform exchange based on 'newtype'      
      !MPI_ALLGATHERV(SENDBUF, SENDCOUNT, SENDTYPE,
      !               RECVBUF, RECVCOUNT, DISPLS, RECVTYPE,
      !               COMM, IERROR)
  call MPI_ALLGATHERV(glob_val(1,mylow),mysize,newtype,
 &                    glob_val,isize,iproc,newtype,
 &                    MPI_COMM_WORLD,ierr)      

c   ---print out global results of process 2
  if(iproc.eq.2) then      
    do i=1,num_rows
      write(*,*) (glob_val(i,j),j=1,num_cols) 
    end do
  end if

  END program

1 Answer:

Answer 0 (score 0):

OK, I got this working as follows:

1) myhigh = mylow + ichunk - 1, not myhigh = mylow + ichunk

2) used_rows must be set before the assignment loop

3) Define the actual buffers more explicitly; try:

call MPI_ALLGATHERV(glob_val(:,mylow:myhigh), mysize, newtype,   &
                    glob_val(1:used_rows,:), isize, idisp, newtype, &
                    MPI_COMM_WORLD, ierr)

Full code, using gfortran and OpenMPI:

  MODULE PARALLEL
    INTEGER iproc, nproc, rank, ierr
    INTEGER mylow, myhigh, mysize, ichunk, irem
    INTEGER, ALLOCATABLE :: isize(:), idisp(:), ilow(:), ihigh(:)
    DOUBLE PRECISION, ALLOCATABLE :: glob_val(:,:)
    INTEGER newtype
  END MODULE


  PROGRAM MAIN
  USE PARALLEL
  use mpi
  IMPLICIT NONE
  ! INCLUDE 'mpif.h'

!   **temp variables
  integer i, j
  integer num_rows,num_cols
  integer used_rows

!    ----setup MPI----
  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierr)
  iproc = rank+1  !rank is base 0, rest of fortran base 1

!   ----setup initial data      
  num_rows=8  !contiguous in memory over rows (ie single column)
  num_cols=4  !noncontiguous memory between different columns
  used_rows = 3

  ALLOCATE (isize(nproc))
  ALLOCATE (idisp(nproc))
  ALLOCATE (ilow(nproc))
  ALLOCATE (ihigh(nproc))      
  ALLOCATE (glob_val(num_rows,num_cols))

!  glob_val = 1.0*iproc   !set all glob values to process id
  glob_val = -1.0 * iproc  
  do i=1,num_cols
    do j=1,used_rows
      glob_val(j,i) = (1.0*iproc)+(.01*j)  !add reference index to used data
    end do
  end do

!   ---setup exchange information
  ichunk = num_cols/nproc
  irem = num_cols -(ichunk*nproc)
  mysize=ichunk
  if(iproc.le.irem) mysize=mysize+1

  mylow=0
  myhigh=0

  do i=1,nproc   !establish global understanding of processes
    mylow=myhigh+1
    myhigh=mylow+ichunk-1
    if(i.le.irem) myhigh=myhigh+1

    isize(i)=myhigh-mylow+1
    idisp(i)=(mylow-1)    !based on receiving type size/extent
    ilow(i)=mylow
    ihigh(i)=myhigh
  end do
  mylow=ilow(iproc)
  myhigh=ihigh(iproc)

  ! MPI_DOUBLE_PRECISION is the portable Fortran datatype for DOUBLE PRECISION
  call MPI_TYPE_VECTOR(1,used_rows,num_rows,MPI_DOUBLE_PRECISION, &
                      newtype,ierr)
  call MPI_TYPE_COMMIT(newtype,ierr)

  write(*,*) rank, idisp
  write(*,*) rank, isize
!   --- perform exchange based on 'newtype'      
      !MPI_ALLGATHERV(SENDBUF, SENDCOUNT, SENDTYPE,
      !               RECVBUF, RECVCOUNT, DISPLS, RECVTYPE,
      !               COMM, IERROR)
  call MPI_ALLGATHERV(glob_val(:,mylow:myhigh),mysize,newtype, &
                     glob_val(1:used_rows,:),isize,idisp,newtype, &
                     MPI_COMM_WORLD,ierr)      

!   ---print out global results of process 2
  if(iproc.eq.2) then      
    do i=1,num_rows
      write(*,*) (glob_val(i,j),j=1,num_cols) 
    end do
  end if

  call MPI_Finalize(ierr)

  END program
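A note on why this receive buffer works, plus an alternative (a sketch going beyond the answer above, untested): glob_val(1:used_rows,:) is a non-contiguous array section, so the compiler typically passes a packed temporary of shape (used_rows, num_cols) and copies it back after the call, and inside that temporary consecutive used_rows-sized blocks line up exactly with the displacements in idisp. The same exchange can be expressed without the hidden copy by resizing the vector type so that its extent spans one full column, which makes idisp count whole columns of glob_val directly, and then gathering in place:

  ! Sketch, reusing the variables from the code above.
  integer :: coltype, colfull
  integer(kind=MPI_ADDRESS_KIND) :: lb, dblext

  ! used_rows values at the top of a num_rows column...
  call MPI_TYPE_VECTOR(1, used_rows, num_rows, MPI_DOUBLE_PRECISION, &
                       coltype, ierr)
  ! ...with the extent stretched to a whole column, so that receive
  ! displacements are measured in whole columns of glob_val
  call MPI_TYPE_GET_EXTENT(MPI_DOUBLE_PRECISION, lb, dblext, ierr)
  call MPI_TYPE_CREATE_RESIZED(coltype, 0_MPI_ADDRESS_KIND, &
                               num_rows*dblext, colfull, ierr)
  call MPI_TYPE_COMMIT(colfull, ierr)

  ! each rank's own columns already sit at their global position in
  ! glob_val, so MPI_IN_PLACE contributes them without a separate
  ! (overlapping) send buffer; send count and type are then ignored
  call MPI_ALLGATHERV(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, &
                      glob_val, isize, idisp, colfull, &
                      MPI_COMM_WORLD, ierr)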