Question

I'm trying to learn OpenMP and came across a case that I can't figure out how to solve with this library.

Suppose we have the following recursive function:

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{
    int tmp[10];
    recurse(tmp, 0, 10);
    return 0;
}

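Using OpenMP, how can I execute performTask in parallel while the main thread keeps generating new structures?

I know there is something called "tasks", and I think that is what I should be using here, but nothing I have come up with gave any performance gain. Please point me in the right direction.

Edit: I made the example program more specific to explain the problem better.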

Answer 1

The code below does not work correctly as written, but hopefully it points in the right direction:

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      // perform task using the thread pool
#pragma omp task     
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{    
    int tmp[10];
    // start threads
#pragma omp parallel
{
    // use single thread to construct `tmp` values
#pragma omp single nowait
    recurse(tmp, 0, 10);
}
    return 0;
}

The code is based on Comparing Nested Parallel Regions and Tasking in OpenMP 3.0.
