Question

A simple example:

module parameters
  implicit none
  integer :: i,step
  integer :: nt=5
  integer :: nelectron=5
  integer :: num_threads=2
  real(8) :: vt=855555.0
  real(8) :: dt=1.d-5
  real(8) :: vx1_old,vy1_old,vz1_old,t1,t2,x_old,y_old
  real(8) :: x_length=0.0
  real(8) :: y_length=0.0
  real(8) :: vx1_new,vy1_new,vz1_new,vzstore,x_new,y_new
end module parameters
program main
  use parameters
  use omp_lib
  implicit none
  integer :: thread_num

  !$ call omp_set_num_threads(num_threads)
  !$ call omp_set_nested(.false.)

  call cpu_time(t1)

  !$omp parallel
  !$omp& default(private) shared(x_length,y_length)
  !$omp& schedule(static,chunk)
  !$omp& reduction(+:x_length,y_length)
  !$omp do

  do i=1,nelectron

     do step=1,nt

        if(step==1)then           
           vx1_new=1.0
           vy1_new=1.0
           vz1_new=1.0
           x_new=1.0
           y_new=1.0 
        endif

        thread_num=omp_get_thread_num()
        write(*,*)"thread_num",thread_num
        write(*,*)"i",i
        write(*,*)"step",step
        write(*,*) 

        vx1_old=vx1_new
        vy1_old=vy1_new
        vz1_old=vz1_new
        x_old=x_new
        y_old=y_new

        x_length=x_length+x_old
        y_length=y_length+y_old
     enddo       
  enddo
  !$omp end do
  !$omp end parallel
  call cpu_time(t2)
  write(*,*)"x length=",x_length
  write(*,*)"y length=",y_length 
end program main

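When I print which thread actually does the work for each i and step, I see something strange:

As you can see, thread 0 is executing i=6, step=1 while thread 1 is executing i=6, step=2. Why does the thread change while the same i is being processed? How can I avoid this? What I want is that, for each i, the whole inner loop over step is completed on the same thread.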

Answer 1

In OpenMP, a loop directive parallelizes only the outermost loop, unless you use the collapse clause. This means that the entire inner loop:

 do step=1,nt

    if(step==1)then           
       vx1_new=1.0
       vy1_new=1.0
       vz1_new=1.0
       x_new=1.0
       y_new=1.0 
    endif

    thread_num=omp_get_thread_num()
    write(*,*)"thread_num",thread_num
    write(*,*)"i",i
    write(*,*)"step",step
    write(*,*) 

    vx1_old=vx1_new
    vy1_old=vy1_new
    vz1_old=vz1_new
    x_old=x_new
    y_old=y_new

    x_length=x_length+x_old
    y_length=y_length+y_old
 enddo

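is executed sequentially by a single thread, with a constant i. Each thread gets its own value of i and runs the code quoted above for that value, without any interaction with the other threads.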
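One way to check this without relying on the interleaved console output is to record the executing thread for every (step, i) pair and inspect the record after the parallel region. The following is only a minimal sketch: the program name, the owner array and the use of named constants for the loop bounds are assumptions made for illustration, and it needs an OpenMP-enabled build (for example gfortran -fopenmp).

```fortran
program inner_loop_owner
  use omp_lib
  implicit none
  ! Illustrative sizes, matching nelectron and nt in the question.
  integer, parameter :: nelectron = 5, nt = 5
  integer :: owner(nt, nelectron)   ! thread id that executed each (step, i)
  integer :: i, step
  logical :: same_thread

  owner = -1

  ! Only the i loop is workshared. Each thread runs the complete
  ! step loop for every i that is assigned to it.
  !$omp parallel do default(none) shared(owner) private(i, step) schedule(static)
  do i = 1, nelectron
     do step = 1, nt
        owner(step, i) = omp_get_thread_num()
     end do
  end do
  !$omp end parallel do

  ! Serial check after the region, so output ordering is not an issue.
  same_thread = .true.
  do i = 1, nelectron
     if (any(owner(:, i) /= owner(1, i))) same_thread = .false.
  end do

  print '(a, l1)', 'every i handled by exactly one thread: ', same_thread
end program inner_loop_owner
```

If the directives are well formed, the final line should report T regardless of how many threads are used.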

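However, as I pointed out in the comments, the syntax of your OpenMP directives is wrong. In this specific case it causes a race condition instead of a proper reduction of x_length and y_length. AFAIK it does not cause the problem you suspect.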

You should write

  !$omp parallel &
  !$omp& default(private)  &
  !$omp& shared(x_length,y_length)

  !$omp do schedule(static,5) reduction(+:x_length,y_length)

or simply avoid the tricky continuation lines

  !$omp parallel default(private) shared(x_length,y_length)

  !$omp do schedule(static,5) reduction(+:x_length,y_length)
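
For reference, here is a stripped-down, self-contained sketch of the loop with well-formed directives. It is not the original program: it swaps default(private) for default(none) with explicit lists, and it turns the loop bounds into named constants. The reason for that choice, as far as I can tell, is that default(private) would also make module variables such as nelectron and nt private, and therefore undefined on entry to the region. The program name and the reduced variable set are assumptions for illustration.

```fortran
program reduction_sketch
  implicit none
  ! Named constants need no data-sharing attribute.
  integer, parameter :: nelectron = 5, nt = 5
  integer :: i, step
  real(8) :: x_new, y_new, x_old, y_old
  real(8) :: x_length, y_length

  x_length = 0.0d0
  y_length = 0.0d0

  !$omp parallel default(none) shared(x_length, y_length) private(i, step, x_new, y_new, x_old, y_old)

  ! The reduction gives every thread its own partial sums and
  ! combines them at the end of the loop, so there is no race.
  !$omp do schedule(static) reduction(+:x_length, y_length)
  do i = 1, nelectron
     do step = 1, nt
        if (step == 1) then
           x_new = 1.0d0
           y_new = 1.0d0
        end if
        x_old = x_new
        y_old = y_new
        x_length = x_length + x_old
        y_length = y_length + y_old
     end do
  end do
  !$omp end do
  !$omp end parallel

  print '(a, f6.1)', 'x length = ', x_length
  print '(a, f6.1)', 'y length = ', y_length
end program reduction_sketch
```

Every (i, step) pair adds 1.0 to each sum, so both totals should come out as 25 for any number of threads.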

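As I commented above, do not trust the printed output of a parallel program unless you enforce the ordering in some way, and never use cpu_time() to time a parallel program.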
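
A hedged sketch of both points: results are stored inside the parallel region and printed afterwards by a single thread, and the timing uses omp_get_wtime(), which returns wall-clock time. With many compilers cpu_time() instead reports processor time summed over all threads, which is why it is misleading here. The program name and the owner array are assumptions for illustration.

```fortran
program ordered_report
  use omp_lib
  implicit none
  integer, parameter :: nelectron = 5
  integer :: owner(nelectron)   ! thread id that handled each i
  integer :: i
  real(8) :: t_start, t_end

  t_start = omp_get_wtime()     ! wall-clock time, not CPU time

  ! No I/O inside the region: each iteration only stores its result.
  !$omp parallel do default(none) shared(owner) private(i)
  do i = 1, nelectron
     owner(i) = omp_get_thread_num()
  end do
  !$omp end parallel do

  t_end = omp_get_wtime()

  ! Printed serially after the region, so lines appear in order of i.
  do i = 1, nelectron
     print '(a, i0, a, i0)', 'i = ', i, ' ran on thread ', owner(i)
  end do
  print '(a, es10.3, a)', 'elapsed wall time: ', t_end - t_start, ' s'
end program ordered_report
```

The thread numbers and the elapsed time will vary from run to run; only the order of the lines is fixed.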

Nested loops: how to run the outer loop in parallel while the inner loop runs sequentially
