OpenMP and internal subroutines

Posted: 2014-11-07 12:40:27

Tags: fortran openmp

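I have been testing the OpenMP speedup of a simple Fortran code in two cases: 1) an internal subroutine called from inside a parallel region, and 2) a parallel region started inside the internal subroutine. In both cases the OpenMP do loop sits inside the internal subroutine. Here is the simple code: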

module module_with_subroutine

use omp_lib
implicit none

integer ,parameter  :: rkind = selected_real_kind(15,307)
real(rkind)     ,dimension(100,100,100)  :: A    
real(rkind) :: elapsed_time
integer     :: clock_start ,clock_end ,clock_rate

contains

subroutine module_subprogram

A = 3.14_rkind

call SYSTEM_CLOCK(count_rate = clock_rate)
call SYSTEM_CLOCK(count = clock_start)

!!$omp parallel num_threads(12)                         !!! For case 1

call intrinsic_subprogram

!!$omp end parallel

call SYSTEM_CLOCK(count = clock_end)

elapsed_time = (clock_end-clock_start)/real(clock_rate,rkind)

print *, 'Elapsed time in seconds: ',elapsed_time
print *, 'Clock start: ',clock_start
print *, 'Clock end: ',clock_end
print *, 'Clock rate: ',clock_rate

contains

    subroutine intrinsic_subprogram
        integer :: i, j, k, steps ,nthread

        do steps = 1,10000
            !$omp parallel num_threads(12)               !!! For case 2
            !$omp do collapse(3) private (i,j,k,nthread)
            do k = 1,100
            do j = 1,100
            do i = 1,100
                nthread = omp_get_thread_num()
                A(i,j,k) = (exp(A(i,j,k)**3.14 + sqrt(A(i,j,k)**3.14) + log(A(i,j,k)**3.14)))**1.414
                if(A(i,j,k) <= 3.14) then
                   A(i,j,k) = 1.0
                end if 
                !print *, 'thread number',nthread
                !print *, 'i,j,k',i,j,k
            end do
            end do
            end do
            !$omp end do
            !$omp end parallel
        end do

    end subroutine

end subroutine

end module

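Strangely, the second case is always slightly faster than the first, even though it starts a new parallel region on every iteration of the steps loop. Could someone explain this behavior? I am new to OpenMP programming, so perhaps I have misunderstood how OpenMP forks threads. Thanks in advance!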
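
One way to compare the two placements in isolation is a minimal sketch like the one below. It is an illustration under stated assumptions, not the original benchmark: the module name `region_placement_sketch`, the routines `run_case1` and `run_case2`, the reduced sizes `n` and `nsteps`, and the `update` function (a stand-in for the question's expression) are all hypothetical. Each variant contains exactly one `parallel` directive, so case 1 has no second `parallel` inside the subroutine. That matters because a `parallel` directive encountered inside an already active region starts a nested region, and when nested parallelism is disabled (a common runtime default) that inner team has a single thread, so every outer thread would execute the whole loop nest itself. Whether that applies to the timings above depends on which directives were active in each run, which the question does not state.

```fortran
module region_placement_sketch
  use omp_lib
  implicit none

  integer, parameter :: rkind = selected_real_kind(15, 307)
  ! Hypothetical sizes, much smaller than the question's 100**3 array and 10000 steps.
  integer, parameter :: n = 24, nsteps = 50
  real(rkind) :: A(n, n, n)

contains

  ! Hypothetical stand-in for the question's update expression; it stays finite.
  pure function update(x) result(y)
    real(rkind), intent(in) :: x
    real(rkind) :: y
    y = sqrt(x) + 1.0_rkind
  end function update

  ! Case 1: one parallel region around the call. Every thread runs the steps
  ! loop, and the orphaned "omp do" splits each sweep across the existing team.
  subroutine run_case1(elapsed)
    real(rkind), intent(out) :: elapsed
    real(rkind) :: t0

    A = 3.14_rkind
    t0 = omp_get_wtime()
    !$omp parallel
    call sweep_orphaned()
    !$omp end parallel
    elapsed = omp_get_wtime() - t0

  contains

    subroutine sweep_orphaned()
      ! Locals of a procedure called inside a parallel region are private per thread.
      integer :: i, j, k, steps

      do steps = 1, nsteps
        !$omp do collapse(3)
        do k = 1, n
          do j = 1, n
            do i = 1, n
              A(i, j, k) = update(A(i, j, k))
            end do
          end do
        end do
        !$omp end do
        ! The implicit barrier at "end do" keeps all threads on the same step.
      end do
    end subroutine sweep_orphaned

  end subroutine run_case1

  ! Case 2: the steps loop runs serially and a parallel region is opened
  ! and closed inside it on every step.
  subroutine run_case2(elapsed)
    real(rkind), intent(out) :: elapsed
    real(rkind) :: t0

    A = 3.14_rkind
    t0 = omp_get_wtime()
    call sweep_forking()
    elapsed = omp_get_wtime() - t0

  contains

    subroutine sweep_forking()
      integer :: i, j, k, steps

      do steps = 1, nsteps
        !$omp parallel
        !$omp do collapse(3)
        do k = 1, n
          do j = 1, n
            do i = 1, n
              A(i, j, k) = update(A(i, j, k))
            end do
          end do
        end do
        !$omp end do
        !$omp end parallel
      end do
    end subroutine sweep_forking

  end subroutine run_case2

end module region_placement_sketch
```

A driver program would `use` the module, call both routines, and print the two elapsed times. Both variants should leave identical values in `A`, since each element is updated independently. Building it needs the compiler's OpenMP option (for example `-fopenmp` with gfortran).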

0 answers:

No answers.