Question

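My code has several functions that modify various arrays, and the order in which the functions are called matters. Because every function is called many times, creating and destroying threads has become a large overhead. I am editing my question because I may have oversimplified my actual problem. An example: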

double ans = 0;
for (int i = 0; i < 4000; i++){
    funcA(a,b,c);
    funcB(a,b,c);
    ans = funcC(a,b,c);
}
printf("%f\n", ans);

where funcA, funcB and funcC are:

void funcA (int* a, point b, int* c){
#pragma omp parallel for shared(a,b,c)
    for (int ii = 0; ii < b.y; ii++){
        for (int jj = 0; jj < b.x; jj++){
          // alter values of a and c
        }
    }
}

void funcB (int* a, point b, int* c){
#pragma omp parallel for shared(a,b,c)
    for (int ii = 0; ii < b.y; ii++){
        for (int jj = 0; jj < b.x; jj++){
          // alter values of a and c
        }
    }
}

double funcC (int* a, point b, int* c){
    double k = 0;
#pragma omp parallel for shared(a,b,c) reduction(+:k)
    for (int ii = 0; ii < b.y; ii++){
        for (int jj = 0; jj < b.x; jj++){
          // alter values of a and c
            k += sqrt(a[ii*jj] + c[ii*jj]);
        }
    }
    return k;
}

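Is there a way to create a team of threads once, before the initial for loop, that all the functions then use, so that it is not repeatedly destroyed and recreated, while still keeping the function calls in the correct order?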

Edit 2:

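What I am looking for is a way to run funcA, funcB and funcC in order. Inside those functions, however, there is code that can use multiple threads. I want a way to create the threads once at the start and then use them only for those parallel parts, so that the final answer is still correct. Is there any way to avoid forking and joining 40000 times?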

Answer 1

Assuming the rest of your code is correct, the following should work the way you want:

#pragma omp parallel shared( a, b, c )
for (int i = 0; i < 4000; i++){
    funcA(a,b,c);
    funcB(a,b,c);
    funcC(a,b,c);
}

with the functions now defined as follows:

void funcA( int* a, point b, int* c ) {
    #pragma omp for schedule( static )
    for (int ii = 0; ii < b.y; ii++) {
        for (int jj = 0; jj < b.x; jj++) {
          // alter values of a and c
        }
    }
}

void funcB( int* a, point b, int* c ) {
    #pragma omp for schedule( static )
    for (int ii = 0; ii < b.y; ii++) {
        for (int jj = 0; jj < b.x; jj++) {
          // alter values of a and c
        }
    }
}

void funcC( int* a, point b, int* c ) {
    #pragma omp for schedule( static )
    for (int ii = 0; ii < b.y; ii++) {
        for (int jj = 0; jj < b.x; jj++) {
          // alter values of a and c
        }
    }
}

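These OpenMP directives inside the functions are called orphaned directives, because they appear in the code outside the lexical extent of any OpenMP parallel region. At run time, however, they bind to whatever OpenMP thread team already exists when the function is called, which is exactly what you want.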
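
Note that funcC in the question returns a sum, whereas the orphaned version of funcC here returns nothing. One possible way to keep the reduction with orphaned directives is sketched below. It is only a sketch under assumptions: the layout of point, the row-major index ii * b.x + jj, the placeholder updates in funcA, the file-scope accumulator k_shared and the driver run are all hypothetical, and a plain sum stands in for the sqrt expression from the question.

```c
typedef struct { int x, y; } point;   /* assumed layout of the question's `point` */

/* File-scope accumulator (hypothetical name): shared by every thread,
   which an orphaned reduction needs. */
static double k_shared;

/* Orphaned worksharing loop: no `parallel` here, so it reuses the caller's team. */
void funcA(int *a, point b, int *c) {
    #pragma omp for schedule(static)
    for (int ii = 0; ii < b.y; ii++) {
        for (int jj = 0; jj < b.x; jj++) {
            a[ii * b.x + jj] += 1;    /* placeholder for "alter values of a and c" */
            c[ii * b.x + jj] += 2;
        }
    }
}

double funcC(int *a, point b, int *c) {
    #pragma omp single
    k_shared = 0.0;                   /* one thread resets; implicit barrier follows */

    #pragma omp for schedule(static) reduction(+:k_shared)
    for (int ii = 0; ii < b.y; ii++) {
        for (int jj = 0; jj < b.x; jj++) {
            /* stand-in for the question's sqrt(...) term */
            k_shared += a[ii * b.x + jj] + c[ii * b.x + jj];
        }
    }
    /* implicit barrier at the end of the loop: k_shared is complete here */

    double k = k_shared;              /* every thread reads the same total */
    #pragma omp barrier               /* no reset can happen before all threads have read */
    return k;
}

/* Hypothetical driver: the team is created once for all iterations. */
double run(int *a, point b, int *c, int iterations) {
    double ans = 0.0;
    #pragma omp parallel shared(a, b, c, ans)
    for (int i = 0; i < iterations; i++) {
        funcA(a, b, c);
        double k = funcC(a, b, c);
        #pragma omp single
        ans = k;                      /* one thread stores the result of this iteration */
    }
    return ans;
}
```

Calling run(a, b, c, 4000) would then open the parallel region once rather than once per function call. The accumulator sits at file scope because the reduction variable of an orphaned omp for has to be shared in the enclosing parallel region, and a local variable of funcC would be private to each thread.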

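In addition, I added a schedule( static ) clause to each loop. It is not needed for correctness, but it may improve performance by ensuring that each thread always handles the same indices across functions and calls...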
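
To see the effect of schedule( static ), here is a small sketch that records which thread handles each index in two consecutive orphaned loops and checks that the assignments agree. The helper names current_thread, record_owner and owners_match are made up for illustration. The check relies on the OpenMP rule that two static loops with the same iteration count and the same chunk size (or none), bound to the same parallel region, assign iterations to threads identically.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Thread number inside a parallel region; 0 when built without OpenMP. */
static int current_thread(void) {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

/* Orphaned loop: records which thread handled each index. */
static void record_owner(int *owner, int n) {
    #pragma omp for schedule(static)
    for (int i = 0; i < n; i++) {
        owner[i] = current_thread();
    }
}

/* Returns 1 if both loops assigned every index to the same thread. */
int owners_match(int *first, int *second, int n) {
    #pragma omp parallel shared(first, second, n)
    {
        record_owner(first, n);
        record_owner(second, n);
    }
    for (int i = 0; i < n; i++) {
        if (first[i] != second[i]) {
            return 0;
        }
    }
    return 1;
}
```

With a dynamic or guided schedule the two arrays could differ from run to run, so data touched by a thread in one function might sit in another core's cache when the next function runs.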
