MPI communication with differently sized data types

Date: 2016-05-30 13:12:07

Tags: arrays multidimensional-array fortran mpi

Suppose the program runs on xp*yp*zp processes. A Cartesian communicator is used, so that the processes can be thought of as arranged in a grid of dimensions (xp, yp, zp). In this program the root process (0) declares and allocates a 3D array Atot, which is to be filled with the 3D arrays A declared by each process (root included).
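
For reference, one possible way to set up such a process grid (not shown in the question; cart_comm and coords are illustrative names, and xp, yp, zp are the grid dimensions mentioned above) would be:

! Sketch only: build the (xp, yp, zp) Cartesian grid and find this process' position in it
INTEGER :: cart_comm, myid, ierr
INTEGER, DIMENSION(3) :: dims, coords
LOGICAL, DIMENSION(3) :: periods

dims = (/ xp, yp, zp /)      ! xp*yp*zp must equal the total number of processes
periods = .FALSE.            ! no periodicity assumed here
CALL MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, .TRUE., cart_comm, ierr)
CALL MPI_Comm_rank(cart_comm, myid, ierr)               ! rank within the Cartesian communicator
CALL MPI_Cart_coords(cart_comm, myid, 3, coords, ierr)  ! this process' grid coordinates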

INTEGER, DIMENSION(3) :: Ntot
INTEGER, DIMENSION(3) :: N
INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Atot
INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: A
:
! the 3 elements of the array N are determined by dividing the corresponding
! element of the array Ntot by the number of processes in that direction,
! taking into account the remainder of the division.
:
IF (myid == 0) THEN ! myid is the process' rank
  ALLOCATE(Atot(Ntot(1),Ntot(2),Ntot(3)))
END IF
ALLOCATE(A(N(1),N(2),N(3)))
A = myid
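
The splitting hinted at in the comment above could be done, for instance, like this (just a sketch: dims and coords are assumed to hold the grid dimensions and this process' Cartesian coordinates, and handing the leftover elements to the lowest coordinates is only one possible convention):

! Sketch: split Ntot over the process grid, handling the remainder of the division
INTEGER :: i
DO i = 1, 3
  N(i) = Ntot(i) / dims(i)
  IF (coords(i) < MOD(Ntot(i), dims(i))) N(i) = N(i) + 1
END DO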

Which is the most correct, easy and efficient way to perform the communication? I was thinking about MPI_Gather: each process would send its whole array A, made up of N(1)*N(2)*N(3) MPI_INTEGERs, and the root process would then receive them into a single MPI derived data type corresponding to a cube (MPI_Type_vector should be used recursively twice, right?). Is it possible to do it this way?
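
As a side note, two plain MPI_Type_vector calls cannot simply be nested, because the stride of the outer vector would be counted in extents of the inner type (unless that extent is adjusted with MPI_Type_create_resized); the usual alternatives are a byte-stride MPI_Type_create_hvector, or describing the whole block at once with MPI_Type_create_subarray as in the answer below. A rough sketch of the vector/hvector route (plane_t, cube_t and intext are illustrative names):

! Sketch: describe an N(1) x N(2) x N(3) block inside the Ntot(1) x Ntot(2) x Ntot(3) array
INTEGER :: plane_t, cube_t, ierr
INTEGER(KIND=MPI_ADDRESS_KIND) :: lb, intext, stride_bytes

! one x-y plane of the block: N(2) rows of N(1) contiguous integers,
! consecutive rows separated by Ntot(1) integers
CALL MPI_Type_vector(N(2), N(1), Ntot(1), MPI_INTEGER, plane_t, ierr)
! stack N(3) such planes; the distance between planes is Ntot(1)*Ntot(2) integers, given in bytes
CALL MPI_Type_get_extent(MPI_INTEGER, lb, intext, ierr)
stride_bytes = INT(Ntot(1), MPI_ADDRESS_KIND) * Ntot(2) * intext
CALL MPI_Type_create_hvector(N(3), 1, stride_bytes, plane_t, cube_t, ierr)
CALL MPI_Type_commit(cube_t, ierr)

Note that such a type describes a block anchored at the first element of the buffer it is applied to; to place each process' block at its own corner of Atot, the root would have to start the receive at the corresponding element, or encode the offset in the type itself, which is what the starts argument of MPI_Type_create_subarray does in the answer below.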

Even if this works, it sounds easy to me only when the number of processes along each direction of the Cartesian communicator divides the corresponding element of Ntot evenly, that is, when the array A has the same dimensions in every process. This is the case when Ntot = (/9,9,9/).

What about the case Ntot = (/10,10,10/)? The MPI derived data type would then have different dimensions on different processes, so would it still be possible to use MPI_Gather?

EDIT

I don't rule out that MPI_GATHERV could be part of the solution. However, it allows each process to send (and the root process to receive) a different amount of data, i.e. a different number of MPI_INTEGERs (in the simple example). In the case I am dealing with, though, the root process has to receive the data into the 3-dimensional array Atot. To do so, I think it could be useful to define an MPI derived data type, let's call it smallcube. In this case, each process sends its whole array A, while the master process receives one datum of type smallcube from each process. The point is that smallcube has different lengths along the three dimensions, depending on its position within the Cartesian grid (assuming the lengths are not evenly divided by the number of processes along the three dimensions).
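
To make the smallcube idea concrete, here is a sketch of the receive-side type for one particular process, taking Ntot = (/10,10,10/) on a 3x3x3 grid and the splitting convention sketched earlier (so the process at Cartesian coordinates (2,2,2) owns a 3x3x3 block starting at element (8,8,8) of Atot); the answer below builds exactly this kind of type for every rank:

! Sketch: the "smallcube" for one particular process (values are illustrative)
INTEGER, DIMENSION(3) :: sizes, subsizes, starts
INTEGER :: smallcube, ierr

sizes    = (/ 10, 10, 10 /)  ! shape of the full array Atot
subsizes = (/ 3, 3, 3 /)     ! shape of this process' block
starts   = (/ 7, 7, 7 /)     ! 0-based position of the block inside Atot
CALL MPI_Type_create_subarray(3, sizes, subsizes, starts, MPI_ORDER_FORTRAN, &
                              MPI_INTEGER, smallcube, ierr)
CALL MPI_Type_commit(smallcube, ierr)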

1 Answer:

Answer 0 (score: 3)

As mentioned in the comments, MPI_Type_create_subarray may well be a good way of doing this if you really do want to pull all of the data onto a single processor. Given that I have only just used MPI_Type_create_subarray in my own project, I thought I would try to provide a working example answer (note that I am being loose with my error checking and with the types I declare).

program subarrayTest
  use mpi
  implicit none
  integer, parameter :: n1 = 10, n2=20, n3=32
  INTEGER, DIMENSION(3) :: Ntot, N, sizes, subsizes, starts
  INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Atot, A
  integer :: iproc, nproc, sendSubType, ierr
  integer :: nl1, nl2, nl3 !Local block sizes
  integer :: l1, l2, l3, u1, u2, u3 !Local upper/lower bounds
  integer :: ip, sendRequest
  integer, dimension(:), allocatable :: recvSubTypes, recvRequests
  integer, dimension(:,:,:), allocatable :: boundsArr

  !MPI Setup
  call mpi_init(ierr)
  call mpi_comm_size(mpi_comm_world, nproc, ierr)
  call mpi_comm_rank(mpi_comm_world, iproc, ierr)

  !Set grid sizes
  Ntot = [n1,n2,n3]
  !For simplicity I'm assuming we only split the last dimension (and it has nproc as a factor)
  !although as long as you can specify l* and u* this should work (and hence nl* = 1+u*-l*)
  if(mod(n3,nproc).ne.0) then
     print*,"Error: n3 must have nproc as a factor."
     call mpi_abort(mpi_comm_world,MPI_ERR_UNKNOWN,ierr)
  endif
  nl1 = n1 ; l1 = 1 ; u1=l1+nl1-1
  nl2 = n2 ; l2 = 1 ; u2=l2+nl2-1
  nl3 = n3/nproc ; l3 = 1+iproc*nl3 ; u3=l3+nl3-1
  N = [nl1,nl2,nl3]

  !Very lazy way to ensure proc 0 knows the upper and lower bounds for all procs
  allocate(boundsArr(2,3,0:nproc-1))
  boundsArr=0
  boundsArr(:,1,iproc) = [l1, u1]
  boundsArr(:,2,iproc) = [l2, u2]
  boundsArr(:,3,iproc) = [l3, u3]
  call mpi_allreduce(MPI_IN_PLACE,boundsArr,size(boundsArr),MPI_INTEGER, &
       MPI_SUM, mpi_comm_world, ierr)

  !Allocate and populate local data portion
  IF (iproc == 0) THEN ! iproc is the process' rank
     ALLOCATE(Atot(Ntot(1),Ntot(2),Ntot(3)))
     Atot=-1 !So you can check all elements are set
  END IF
  ALLOCATE(A(N(1),N(2),N(3)))
  A = iproc

  !Now lets create the sub array types
  !First do the send type
  sizes=N !The size of the local array
  subsizes=1+[u1,u2,u3]-[l1,l2,l3] !The amount of data in each dimension to send -- here it's the full local data array but in general it could be a small subset
  starts = [0,0,0] !These are the lower bounds in each dimension where the sub array starts -- Note MPI assumes 0 indexing here.
  call mpi_type_create_subarray(size(sizes),sizes, subsizes, starts, &
       MPI_ORDER_FORTRAN, MPI_INTEGER, sendSubType, ierr)
  call mpi_type_commit(sendSubType, ierr)

  !Now on proc0 setup each receive type
  if (iproc == 0) then
     allocate(recvSubTypes(0:nproc-1)) !Use 0 indexing for ease
     sizes = Ntot !Size of dest array
     do ip=0,nproc-1
        subsizes=1+boundsArr(2,:,ip)-boundsArr(1,:,ip) !Size of A being sent from proc ip
        starts = boundsArr(1,:,ip) -1
        call mpi_type_create_subarray(size(sizes),sizes, subsizes, starts, &
             MPI_ORDER_FORTRAN, MPI_INTEGER, recvSubTypes(ip), ierr)
        call mpi_type_commit(recvSubTypes(ip), ierr)
     end do
  end if

  !Now lets use non-blocking communications to transfer data
  !First post receives -- tag with source proc id
  if (iproc == 0) then
     allocate(recvRequests(0:nproc-1))
     do ip=0,nproc-1
        call mpi_irecv(Atot,1,recvSubTypes(ip),ip,ip,&
             mpi_comm_world,recvRequests(ip),ierr)
     end do
  end if

  !Now post sends
  call mpi_isend(A,1,sendSubType,0,iproc,mpi_comm_world,&
       sendRequest, ierr)

  !Now wait on receives/sends
  if(iproc == 0) call mpi_waitall(size(recvRequests),recvRequests,&
       MPI_STATUSES_IGNORE,ierr)
  call mpi_wait(sendRequest, MPI_STATUS_IGNORE, ierr)

  if(iproc == 0) print*,Atot
  call mpi_barrier(mpi_comm_world, ierr)

  !Now free resources -- not shown
  call mpi_finalize(ierr)
end program subarrayTest

You should be able to compile this with mpif90. You'll need to play around with it to set the local bounds appropriately for your case, but hopefully it provides a useful starting point. It doesn't assume anything about the local array sizes being the same across processors: as long as the lower and upper bounds (l* and u*) are set correctly, it should work. Note that the code above probably doesn't follow best practice in a number of ways.
您应该可以使用l*进行编译。你需要解决这个问题,以便为你的情况适当地设置局部边界,但希望这将提供一个有用的起点。这并不假设任何关于处理器之间的本地数组大小是相同的,只要正确设置了下限和上限(u* SELECT CASE WHEN :param = 1 THEN 'TEST1' WHEN :param = 2 THEN 'TEST2' WHEN :param = 3 THEN 'TEST3' END AS RESULT FROM DUAL ),那么这应该可以正常工作。请注意,上面的代码可能不会以多种方式遵循最佳实践。