Question

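I have a parallel region that I use to monitor progress. That is, I use the variable iteration to track the current state of the loop (as a percentage: 0 to 100 by the time the loop finishes).

To do this, I increment it with an atomic operation. Is there a way to shorten the code, perhaps by including iteration++ in the #pragma omp parallel for clause?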

  int iteration = 0;
#pragma omp parallel for 
  for (int64_t ip = 0; ip < num_voxels; ip++)
  {
    // calc stuff
#pragma omp atomic
    iteration++;
    // output stuff
    // if thread == 0:
    // Progress(iteration / num_voxels * 100);
  }

Answer 1

I don't think it's possible to increment iteration anywhere other than inside the loop body. For instance, this is not allowed:

std::atomic<int> iteration{0};
#pragma omp parallel for 
for (int64_t ip = 0; ip < num_voxels; ip++, iteration++) { ...

since OpenMP requires the so-called canonical loop form, in which the increment expression may update only the loop variable (see Section 2.6 of the OpenMP 4.5 specification).

I would also strongly advise against incrementing iteration in every loop iteration, since it would be very inefficient (atomic memory operations mean memory fences and cache contention).
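
For contrast, here is a minimal sketch of a form that should be accepted: the loop header updates only ip, and the counter is incremented in the body. The function name count_in_body is hypothetical. Mixing std::atomic with OpenMP threads works with common compilers in practice, but treat that interaction as an assumption rather than something the OpenMP specification guarantees.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: counts loop iterations with the increment in the body.
std::int64_t count_in_body(std::int64_t num_voxels) {
    assert(num_voxels >= 0);
    std::atomic<std::int64_t> iteration{0};
    #pragma omp parallel for
    for (std::int64_t ip = 0; ip < num_voxels; ip++) {  // header touches only ip
        // calc stuff would go here
        iteration.fetch_add(1, std::memory_order_relaxed);
    }
    return iteration.load();
}
```

Calling count_in_body(n) returns n whether or not the file is compiled with OpenMP enabled, because the pragma is simply ignored in a serial build.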

I would prefer, e.g.:

int64_t iteration = 0;
int64_t local_iteration = 0;
#pragma omp parallel for firstprivate(local_iteration)
for (int64_t ip = 0; ip < num_voxels; ip++) {
   ... // calc stuff
   if (++local_iteration % 1024 == 0) { // 1024 is a power of two, so the modulo compiles to cheap bit operations
     #pragma omp atomic
     iteration += 1024;
   }
   // output stuff
   // if thread == 0:
   // Progress(iteration * 100 / num_voxels);
}
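
One caveat with batching: a thread that ends with fewer than 1024 unflushed iterations never adds them, so the shared counter can finish short of num_voxels. A minimal sketch of one way to handle that, assuming the loop is split into a parallel region plus an omp for so that each thread can flush its leftover count after the loop. The function name and the kBatch constant are illustrative, not part of the answer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: batched counting that also flushes each thread's remainder.
std::int64_t count_in_batches(std::int64_t num_voxels) {
    assert(num_voxels >= 0);
    constexpr std::int64_t kBatch = 1024;  // assumed batch size, as in the answer
    std::int64_t iteration = 0;            // shared counter
    #pragma omp parallel
    {
        std::int64_t local_iteration = 0;  // private: declared inside the region
        #pragma omp for
        for (std::int64_t ip = 0; ip < num_voxels; ip++) {
            // calc stuff would go here
            if (++local_iteration == kBatch) {
                #pragma omp atomic
                iteration += kBatch;
                local_iteration = 0;
            }
        }
        // Flush whatever is left over (0 to kBatch - 1 iterations per thread).
        #pragma omp atomic
        iteration += local_iteration;
    }
    return iteration;
}
```

With the flush in place the function returns exactly num_voxels, at the cost of one extra atomic update per thread.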

Also, output only when the progress percentage changes. This can be tricky, since you need to read iteration atomically and you likely don't want to do that in every iteration. A possible solution, which also saves many cycles otherwise spent on "expensive" integer division:

int64_t iteration = 0;
int64_t local_iteration = 0;
int64_t last_progress = 0;
#pragma omp parallel for firstprivate(local_iteration)
for (int64_t ip = 0; ip < num_voxels; ip++) {
   ... // calc stuff
   if (++local_iteration % 1024 == 0) { // 1024 is a power of two, so the modulo compiles to cheap bit operations
      #pragma omp atomic
      iteration += 1024;

      // output stuff (only thread 0 reads or writes last_progress):
      if (omp_get_thread_num() == 0) {
         int64_t progress;
         #pragma omp atomic read
         progress = iteration;
         // multiply first: progress / num_voxels * 100 truncates to 0 in integer arithmetic
         progress = progress * 100 / num_voxels;
         if (progress != last_progress) {
            Progress(progress);
            last_progress = progress;
         }
      }
   }
}
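
To try this outside the original program, the same idea can be wrapped in a function. This is a sketch under assumptions: run_with_progress, thread_id and the report callback are hypothetical stand-ins (report plays the role of Progress), and the _OPENMP guard is only there so the code also builds serially.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical wrapper so the sketch also builds without OpenMP.
static int thread_id() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Calls report(percent) from thread 0 whenever the integer percentage changes.
// Returns the last percentage that was reported (0 if none was).
std::int64_t run_with_progress(std::int64_t num_voxels,
                               const std::function<void(std::int64_t)>& report) {
    assert(num_voxels > 0);
    constexpr std::int64_t kBatch = 1024;  // assumed batch size, as in the answer
    std::int64_t iteration = 0;            // shared, updated atomically
    std::int64_t local_iteration = 0;      // each thread gets its own copy
    std::int64_t last_progress = 0;        // read and written by thread 0 only
    #pragma omp parallel for firstprivate(local_iteration)
    for (std::int64_t ip = 0; ip < num_voxels; ip++) {
        // calc stuff would go here
        if (++local_iteration % kBatch == 0) {
            #pragma omp atomic
            iteration += kBatch;
            if (thread_id() == 0) {
                std::int64_t done;
                #pragma omp atomic read
                done = iteration;
                const std::int64_t progress = done * 100 / num_voxels;
                if (progress != last_progress) {
                    report(progress);
                    last_progress = progress;
                }
            }
        }
    }
    return last_progress;
}
```

Because only thread 0 reports, the final value it prints may be below 100 if other threads are still working when it finishes its share. A final Progress(100) after the loop is one way to close that gap.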
