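Reducing threadprivate arrays into a shared array

Asked: 2014-09-17 08:05:56

Tags: c openmp

I need to perform several operations on threadprivate arrays and then sum their contents into a global shared array. I have tried two approaches. The first uses the atomic directive: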

#define N 100
void eval_local_xyz(void); // defined elsewhere (not shown)
int i;
#pragma omp threadprivate(i)
double local_x[N],local_y[N],local_z[N],x[N],y[N],z[N];
#pragma omp threadprivate(local_x,local_y,local_z)

int main(){

  for(i=0;i<N;i++) x[i]=y[i]=z[i]=0.;

  eval_local_xyz(); // the content of local_x,local_y,local_z is now changed

  // now we want to collect the local arrays into the global ones
  #pragma omp parallel
  {
    for(i=0;i<N;i++){
      #pragma omp atomic
      x[i]+=local_x[i];
      #pragma omp atomic
      y[i]+=local_y[i];
      #pragma omp atomic
      z[i]+=local_z[i];
    }
  }
}

The second uses critical sections:

#define N 100
void eval_local_xyz(void); // defined elsewhere (not shown)
int i;
#pragma omp threadprivate(i)
double local_x[N],local_y[N],local_z[N],x[N],y[N],z[N];
#pragma omp threadprivate(local_x,local_y,local_z)

int main(){

  for(i=0;i<N;i++) x[i]=y[i]=z[i]=0.;

  eval_local_xyz(); // the content of local_x,local_y,local_z is now changed

  // now we want to collect the local arrays into the global ones
  #pragma omp parallel
  {
    #pragma omp critical (sumx)
    for(i=0;i<N;i++) x[i]+=local_x[i];
    #pragma omp critical (sumy)
    for(i=0;i<N;i++) y[i]+=local_y[i];
    #pragma omp critical (sumz)
    for(i=0;i<N;i++) z[i]+=local_z[i];
  }
}

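For large N, the second approach appears to be faster than the first. However, the two approaches give me slightly different results. The question is: should the two approaches produce the same result?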

0 answers:

No answers yet.