I have a program that I am parallelizing with MPI. I have attached test code that splits an array into segments across a given number of processors; I then use broadcasts to combine the results from all processes and distribute them back to every process. The problem occurs when the distribution is not the same on all processors, which causes the broadcast to fail. The code below works with 6 processes but not with 8.
program main
  include 'mpif.h'
  integer i, rank, nprocs, count, start, stop, nloops
  integer ierr, n1
  REAL srt, sp
  REAL b(1,1:10242)
  REAL c(1,1:10242)
  integer, Allocatable :: jjsta(:), jjlen(:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Allocate only after nprocs is known
  allocate (jjsta(0:nprocs-1), jjlen(0:nprocs-1))

  n1 = 10242
  ! Work out this rank's share of the iterations
  call para_range(1, n1, nprocs, rank, start, stop)
  jjsta(rank) = start
  jjlen(rank) = stop - start + 1

  srt = MPI_WTIME()
  do i = start, stop
     c(1,i) = i
  enddo
  sp = MPI_WTIME()

  print *, "Process ", rank, " performed ", jjlen(rank), &
       " iterations of the loop."

  ! Combine the partial results from every process:
  ! CALL MPI_ALLGATHERV(c(1,jjsta(rank)), jjlen(rank), MPI_REAL, &
  !      b, jjlen(rank), jjsta(rank)+jjlen(rank)+1, MPI_REAL, MPI_COMM_WORLD, ierr)
  CALL MPI_BCAST(c(1,jjsta(rank)), jjlen(rank), MPI_REAL, &
       rank, MPI_COMM_WORLD, ierr)

  call MPI_Finalize(ierr)
  deallocate(jjsta, jjlen)
end program main
subroutine para_range(n1, n2, nprocs, irank, ista, iend)
  integer(4) :: n1      ! Lowest value of iteration variable
  integer(4) :: n2      ! Highest value of iteration variable
  integer(4) :: nprocs  ! Number of cores
  integer(4) :: irank   ! Rank of the calling process
  integer(4) :: ista    ! Start of iterations for rank irank
  integer(4) :: iend    ! End of iterations for rank irank
  integer(4) :: iwork1, iwork2

  iwork1 = (n2 - n1 + 1) / nprocs     ! base chunk size
  ! print *, iwork1
  iwork2 = MOD(n2 - n1 + 1, nprocs)   ! iterations left over
  ! print *, iwork2
  ista = irank*iwork1 + n1 + MIN(irank, iwork2)
  iend = ista + iwork1 - 1
  if (iwork2 > irank) iend = iend + 1 ! the first iwork2 ranks get one extra
  return
end subroutine para_range
When I run it with 8 processors with the MPI_Bcast commented out, this is what it prints, and it works fine:
Process 2 performed 1280 iterations of the loop.
Process 1 performed 1281 iterations of the loop.
Process 4 performed 1280 iterations of the loop.
Process 5 performed 1280 iterations of the loop.
Process 3 performed 1280 iterations of the loop.
Process 6 performed 1280 iterations of the loop.
Process 0 performed 1281 iterations of the loop.
Process 7 performed 1280 iterations of the loop.
But when I try to combine all the results with the broadcast, this is the error I get:
Fatal error in MPI_Bcast:
Message truncated, error stack:
MPI_Bcast(1128)...................: MPI_Bcast(buf=0x2af182d8d028, count=1280, MPI_REAL, root=4, MPI_COMM_WORLD) failed
knomial_2level_Bcast(1268)........:
MPIDI_CH3U_Receive_data_found(259): Message from rank 0 and tag 2 truncated; 5124 bytes received but buffer size is 5120
Process 4 performed 1280 iterations of the loop.
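If I am reading the error correctly, rank 0 is sending 1281 reals (5124 bytes) while the matching receive was only posted for 1280 (5120 bytes), because every process passes its own rank as the root and its own local count to MPI_BCAST. My guess is that the broadcast would have to be issued once per root, with every rank using that root's count, roughly like the sketch below (this assumes jjsta and jjlen are filled in for every rank, which my code does not do yet):

! Sketch: one collective broadcast per owning rank; every process must pass
! the same root and the same count for that root.  Assumes jjsta(p) and
! jjlen(p) are known on every rank for all p.
integer :: p
do p = 0, nprocs-1
   call MPI_BCAST(c(1,jjsta(p)), jjlen(p), MPI_REAL, p, MPI_COMM_WORLD, ierr)
enddo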
I also tried MPI_Allgatherv to combine the results into array b, and I get the following error:
Fatal error in MPI_Allgatherv:
Message truncated, error stack:
MPI_Allgatherv(1050)................: MPI_Allgatherv(sbuf=0x635df8, scount=1280, MPI_REAL, rbuf=0x62bde8, rcounts=0x2b3ff297b244, displs=0x7fff3ebc1fb0, MPI_REAL, MPI_COMM_WORLD) failed
MPIR_Allgatherv(243)................:
MPIC_Sendrecv(117)..................:
MPIDI_CH3U_Request_unpack_uebuf(631): Message truncated; 5124 bytes received but buffer size is 5120
When combining the arrays, the problem seems to be that the lengths of the segments being sent are never updated on the other processes, and I cannot figure out how to fix it.
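My current guess is that MPI_Allgatherv needs receive-count and displacement arrays that are filled in for every rank on every process, something like the sketch below (idisp is just a name I made up for the zero-based displacements into b):

! Sketch: compute every rank's block on every process, then gather.
! idisp(p) is the zero-based offset of rank p's block in b.
integer :: p, iend_p
integer, allocatable :: idisp(:)
allocate (idisp(0:nprocs-1))
do p = 0, nprocs-1
   call para_range(1, n1, nprocs, p, jjsta(p), iend_p)
   jjlen(p) = iend_p - jjsta(p) + 1   ! elements owned by rank p
   idisp(p) = jjsta(p) - 1            ! where rank p's block starts in b
enddo
call MPI_ALLGATHERV(c(1,jjsta(rank)), jjlen(rank), MPI_REAL, &
     b, jjlen, idisp, MPI_REAL, MPI_COMM_WORLD, ierr)

Is that the right way to set it up, or is there something simpler I am missing? Any help would be greatly appreciated. Thanks.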