Subroutine inside an OpenMP PARALLEL DO - program crashes

Time: 2016-06-01 15:11:48

Tags: parallel-processing fortran openmp subroutine

I have read "Calling an internal subroutine inside OpenMP region" and "Global Variables in Fortran OpenMP". My understanding (from here) is:

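  • Variables in the argument list inherit their data-scope attribute from the calling routine.
  • COMMON blocks and module variables in Fortran are shared unless declared THREADPRIVATE.
  • SAVE variables in Fortran are shared.
  • All other local variables are private.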
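
As an illustration of these four rules only (this is not the code from the question), here is a minimal sketch. The names scope_demo_mod, accumulate, module_total, and calls are hypothetical, and it assumes compilation with -fopenmp and without -fno-automatic, so that plain locals stay on the stack.

```fortran
module scope_demo_mod
  implicit none
  integer :: module_total = 0        ! module variable: shared by all threads
contains
  subroutine accumulate(j, out)
    integer, intent(in)  :: j        ! dummy arguments: scoping comes from the caller
    integer, intent(out) :: out
    integer, save :: calls = 0       ! SAVE: a single copy shared by all threads
    integer :: tmp                   ! plain local: one copy per call, so private

    tmp = 2*j
    out = tmp
    ! Shared variables need protected updates.
    !$omp atomic
    calls = calls + 1
    !$omp atomic
    module_total = module_total + tmp
  end subroutine accumulate
end module scope_demo_mod

program scope_demo
  use scope_demo_mod
  implicit none
  integer, parameter :: n = 1000
  integer :: j, tmp_out
  integer :: res(n)

  !$omp parallel do default(shared) private(j, tmp_out)
  do j = 1, n
    call accumulate(j, tmp_out)      ! tmp_out is private in the caller
    res(j) = tmp_out                 ! each thread writes its own elements
  end do
  !$omp end parallel do

  ! Both totals should agree if the shared updates were protected correctly.
  print *, sum(res), module_total
end program scope_demo
```

Removing the atomic directives would turn the updates of calls and module_total into data races, which is the kind of problem a shared (SAVEd) local can cause.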

Here is a simplified version of my code:

!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(j,dummy1,dummy2,dummy3,dummy4)
DO j=1,ntotal
  dummy1 = 0.0d0
  dummy2 = foo(j)
  CALL kernel(dummy1,dummy1,dummy2,dummy3,dummy4)  
  Variable(j) = dummy3 + dummy4
END DO 
!$OMP END PARALLEL DO 

The subroutine kernel then takes dummy1 and dummy2 as input (IN) and returns dummy3 and dummy4 as output (OUT). I compile with:

 -fopenmp -fno-automatic -fcheck=all

And I get:

Fortran runtime error: Recursive call to nonrecursive procedure 'kernel'

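I understand what this means from here. When I compile without -fcheck, the code sometimes gets past the subroutine call without incident, but most of the time it crashes with no error message. I assume this is because my subroutine is not thread-safe. All arguments passed to the subroutine should be private and distinct for each thread. The trimmed subroutine is as follows: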

SUBROUTINE kernel(r,dx,hsml,w,dwdx)   

  USE Initial_Parameters
  IMPLICIT NONE 

  ! DATA DICTIONARY: DECLARE CALLING PARAMETER TYPES AND DEFINITIONS
  REAL(KIND=dp), INTENT(IN)                 ::  r           
  REAL(KIND=dp), DIMENSION(dim), INTENT(IN) ::  dx          
  REAL(KIND=dp), INTENT(IN)                 ::  hsml        
  REAL(KIND=dp), INTENT(OUT)                ::  w           
  REAL(KIND=dp), DIMENSION(dim), INTENT(OUT)::  dwdx        
  ! DATA DICTIONARY: DECLARE LOCAL VARIABLE TYPES AND DEFINITIONS
  INTEGER                                   ::  i, d  
  REAL(KIND=dp)                             ::  q, dw
  REAL(KIND=dp)                             ::  factor      


  ! Kernel functions are functions of q, the distance between particles
  ! divided by the smoothing length  
  q = r/hsml 
  ! Preset the kernel to zero
  w = 0.e0
  ! Preset the derivative of the kernel to zero
  DO d=1,dim         
    dwdx(d) = 0.e0
  END DO   

  IF (skf == 1) THEN     

    ! If the problem is one dimensional then,
    IF (dim == 1) THEN
      ! The coefficient, alpha = factor is given by:
      factor = 1.e0/hsml
    ! If the problem is two dimensional then,
    ELSE IF (dim == 2) THEN
      ! The coefficient, alpha = factor is given by:
      factor = 15.e0/(7.e0*pi*hsml*hsml)
    ! If the problem is three dimensional then,
    ELSE IF (dim == 3) THEN
      ! The coefficient, alpha = factor is given by:
      factor = 3.e0/(2.e0*pi*hsml*hsml*hsml)
    ! If the dimension value is not 1, 2 or 3 then there is a problem.
    ELSE
       WRITE(*,*)' >>> Error <<< : Wrong dimension: Dim =',dim
       STOP
    END IF

    ! Smoothing function for 1st range of q.                                         
    IF (q >= 0 .AND. q <= 1.e0) THEN
      ! The smoothing function is given by:
      w = factor * (2./3. - q*q + q*q*q / 2.)
      ! For each dimension work out the gradient of the smoothing function
      DO d = 1, dim
        dwdx(d) = factor * (-2.+3./2.*q)/hsml**2 * dx(d)       
      END DO   

    ! Smoothing function for 2nd range of q.  
    ELSE IF (q > 1.e0 .AND. q <= 2) THEN  
      ! Smoothing function is equal to:        
      w = factor * 1.e0/6.e0 * (2.-q)**3 
      ! Gradient of the smoothing function in each dimension.
      DO d = 1, dim
        dwdx(d) =-factor * 1.e0/6.e0 * 3.*(2.-q)**2/hsml * (dx(d)/r)        
      END DO   

    ! Smoothing function and gradient for all other values of q is zero.
    ELSE
      ! Smoothing function is equal to: 
      w=0.
      ! Gradient of the smoothing function in each dimension.
      DO d= 1, dim
        dwdx(d) = 0.
      END DO             
    END IF     

  END IF   ! closes IF (skf == 1)

END SUBROUTINE kernel

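The local variables should be private, and all the arguments passed in are private. The module parameters are shared, but that is fine. Can you explain why this crashes?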

1 Answer:

Answer 0: (score: 3)

With -fno-automatic, the local variables in kernel are implicitly SAVEd. The notes here state:

  

"Local variables that have the SAVE attribute and are declared in a procedure called from a parallel region are implicitly shared."

So kernel is indeed not thread-safe (as far as I can tell).
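
A minimal sketch of one way around this, under stated assumptions. The simplest option is probably to drop -fno-automatic from the compile line, so that q, dw, factor, and the loop counters go back on the stack. If the flag has to stay for other parts of the code, my reading of the gfortran documentation is that -fno-automatic does not apply to procedures marked RECURSIVE. The names kernel_sketch and kernel_1d are hypothetical, and the kernel is cut down to the one-dimensional cubic-spline case purely for illustration.

```fortran
module kernel_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Assumption: RECURSIVE keeps the locals automatic (one copy per call)
  ! even when the file is compiled with -fno-automatic.
  recursive subroutine kernel_1d(r, hsml, w)
    real(dp), intent(in)  :: r, hsml
    real(dp), intent(out) :: w
    real(dp) :: q, factor            ! no SAVE, no initialisation in the declaration

    q = r/hsml
    factor = 1.0_dp/hsml
    if (q >= 0.0_dp .and. q <= 1.0_dp) then
      w = factor*(2.0_dp/3.0_dp - q*q + 0.5_dp*q**3)
    else if (q > 1.0_dp .and. q <= 2.0_dp) then
      w = factor*(2.0_dp - q)**3/6.0_dp
    else
      w = 0.0_dp
    end if
  end subroutine kernel_1d
end module kernel_sketch

program kernel_demo
  use kernel_sketch
  implicit none
  integer, parameter :: n = 8
  integer :: j
  real(dp) :: w, wvals(n)

  !$omp parallel do default(shared) private(j, w)
  do j = 1, n
    call kernel_1d(0.25_dp*j, 1.0_dp, w)   ! each thread works on its own q and factor
    wvals(j) = w
  end do
  !$omp end parallel do

  print '(8f8.4)', wvals
end program kernel_demo
```

Either way, the point is the same: every local in a routine called from the parallel region must have one copy per call rather than one static copy for the whole program.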

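Also note that in your example you pass dummy1 as both the first and the second argument to kernel, but your definition of this routine declares the first argument (r) as a scalar, whereas the second (dx) is an array of length dim. I am not sure whether this is just an artifact of your minimal example or is in the real code, but it could cause problems. Do you declare kernel inside a module and then use that module? That would generate an interface, which helps catch this kind of thing.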
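
To show what the module interface buys you, here is a small sketch. The names sph_kernels and kernel_stub are hypothetical, and the body is a placeholder rather than the real kernel; only the argument shapes mirror the routine in the question.

```fortran
module sph_kernels
  implicit none
  integer, parameter :: dp  = kind(1.0d0)
  integer, parameter :: dim = 2
contains
  subroutine kernel_stub(r, dx, w, dwdx)
    real(dp), intent(in)                  :: r      ! scalar distance
    real(dp), dimension(dim), intent(in)  :: dx     ! separation vector
    real(dp), intent(out)                 :: w
    real(dp), dimension(dim), intent(out) :: dwdx

    ! Placeholder arithmetic, just so the outputs are defined.
    w    = r
    dwdx = dx
  end subroutine kernel_stub
end module sph_kernels

program interface_demo
  use sph_kernels                    ! makes the interface of kernel_stub explicit
  implicit none
  real(dp) :: r, w
  real(dp) :: dx(dim), dwdx(dim)

  dx = [3.0_dp, 4.0_dp]
  r  = sqrt(sum(dx**2))

  call kernel_stub(r, dx, w, dwdx)   ! ranks match the interface

  ! With the explicit interface, the compiler rejects a scalar passed
  ! where the array dx is expected (rank mismatch), for example:
  ! call kernel_stub(r, r, w, dwdx)

  print *, w, dwdx
end program interface_demo
```

With an external subroutine and no interface, the same mistake usually compiles silently and the routine then reads past the scalar in memory.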