How can I make nested functions run in parallel with OpenMP?

Time: 2017-05-23 09:17:38

Tags: multithreading parallel-processing openmp

Take the following code as an example:

#include <stdio.h>
#include <omp.h>

void func(void) {
#pragma omp parallel for
    for (int i = 0; i < 4; i++) {
        printf("%s: %d\n", __func__, omp_get_thread_num());
    }
}

int main(void) {
#pragma omp parallel for
    for (int i = 0; i < 2; i++) {
        printf("%s: %d\n", __func__, omp_get_thread_num());
        func();
    }
    return 0;
}

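I expected main to spawn 2 threads, each of which calls func, and each of those threads to spawn another 3 threads inside func, so there would be 8 threads in total. But running the program above gives: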

$ ./a.out
main: 1
main: 0
func: 0
func: 0
func: 0
func: 0
func: 0
func: 0
func: 0
func: 0

This shows that only the 2 outer threads were created. I then tried using collapse:

void func(void) {
#pragma omp parallel for
    for (int i = 0; i < 4; i++) {
        printf("%s: %d\n", __func__, omp_get_thread_num());
    }
}

int main(void) {
#pragma omp parallel for collapse(2)
    for (int i = 0; i < 2; i++) {
        printf("%s: %d\n", __func__, omp_get_thread_num());
        func();
    }
    return 0;
}

The compiler complains as follows:

parallel.c: In function ‘main’:
parallel.c:15:3: error: not enough perfectly nested loops before ‘printf’
   printf("%s: %d\n", __func__, omp_get_thread_num());
   ^~~~~~

So it seems collapse only applies to perfectly nested loops, as in the following case:

#pragma omp parallel for collapse(2)
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 4; j++) {
            printf("%s: %d\n", __func__, omp_get_thread_num());
        }
    }

Is there any way to make nested functions run in parallel?

0 answers:

No answers yet