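I have to parallelize the outer for loop with OpenMP, but it contains an inner for loop that cannot be parallelized because of data dependences. I tried parallelizing the outer loop, but I ran into problems with the pointers.
Minimal example of the problem: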
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <omp.h>
int main()
{
    int N = 5;
    int size = 6;
    int n, j, i;
    double t[] = {1,2,3,4,5,6};
    double z, h2M, R2M, dz;
    int *dynamic_d;
    int *dynamic_A;
    int *dynamic_B;
    int *output;
    dynamic_d = (int *) calloc (N+1, sizeof(int));
    for(i = 0; i < N+1; i++){
        *(dynamic_d + i) = i;
    }
    dynamic_A = (int*) calloc (N+2, sizeof(int));
    dynamic_B = (int*) calloc (N+2, sizeof(int));
    output = (int*) calloc (size, sizeof(int));
    for (j = 0; j < size; j++) {
        z = t[j] + 1;
        *dynamic_A = 0;
        *dynamic_B = 1;
        *(dynamic_A + 1) = *dynamic_d;
        *(dynamic_B + 1) = 1;
        for (n = 2; n <= N+1; n++) {
            dz = *(dynamic_d + n-1)*z;
            *(dynamic_A + n) = *(dynamic_A + n-1) + dz + (*(dynamic_A + n-2));
            *(dynamic_B + n) = *(dynamic_B + n-1) + dz + (*(dynamic_B + n-2));
        }
        h2M = z + *(dynamic_d + N-1) - *(dynamic_d + N);
        R2M = -h2M + z + *(dynamic_d + N);
        *(dynamic_A + N+1) = *(dynamic_A + N) + R2M + *(dynamic_A + N-1);
        *(dynamic_B + N+1) = *(dynamic_B + N) + R2M + *(dynamic_B + N-1);
        *(output + j) = t[j] + *(dynamic_A + N+1) + *(dynamic_B + N+1);
    }
    printf("\n\noutput:\n");
    for (j = 0; j < size; j++){
        printf("| %d ", output[j]);
    }
    printf("\n");
    return 0;
}
Answer 0 (score: 0)
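The only array data dependences involve dynamic_A and dynamic_B, because they are the only arrays that are both written and read inside the loop. dynamic_d is only read and output is only written, so neither causes a problem.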
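However, if you look closely at the dependences on dynamic_A and dynamic_B, you can see that they are not loop-carried with respect to the outer loop: any value dynamic_A[i] computed in iteration j is used only within that same iteration. The whole array is overwritten in the next iteration of the outermost loop.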
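You need to rewrite the code so that each thread has its own private copy of dynamic_A and dynamic_B. The scalar temporaries written inside the loop (z, h2M, R2M, dz, and the inner loop counter n) must be private to each thread as well. For example: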
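/* dz and n are also written in every iteration, so they must be private too;
   only j, the loop variable of the "omp for" loop, is made private automatically. */
#pragma omp parallel private(dynamic_A, dynamic_B, z, h2M, R2M, dz, n)
{
    /* each thread allocates its own scratch arrays */
    dynamic_A = (int*) calloc (N+2, sizeof(int));
    dynamic_B = (int*) calloc (N+2, sizeof(int));
    #pragma omp for
    for (j = 0; j < size; j++) {
        ...
    }
    free(dynamic_A);
    free(dynamic_B);
}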
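As a fuller illustration of the same idea, here is a minimal sketch that moves the body of the outer loop into a helper and declares the scratch arrays inside the parallel region, so they are private without a private clause. The names eval_one, compute_serial and compute_parallel are hypothetical and do not appear in the question; the arithmetic is copied from the question's loop, with the implicit double-to-int truncation written as explicit casts. If the compiler is not given an OpenMP flag (for example -fopenmp with GCC), the pragmas are ignored and the code simply runs serially.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: one iteration of the question's outer loop.
   A and B are caller-supplied scratch buffers of length N+2. */
static int eval_one(double tj, const int *d, int N, int *A, int *B)
{
    double z = tj + 1;
    A[0] = 0;
    B[0] = 1;
    A[1] = d[0];
    B[1] = 1;
    /* This recurrence depends on n-1 and n-2, so it stays sequential. */
    for (int n = 2; n <= N + 1; n++) {
        double dz = d[n - 1] * z;
        A[n] = (int)(A[n - 1] + dz + A[n - 2]);
        B[n] = (int)(B[n - 1] + dz + B[n - 2]);
    }
    double h2M = z + d[N - 1] - d[N];
    double R2M = -h2M + z + d[N];
    A[N + 1] = (int)(A[N] + R2M + A[N - 1]);
    B[N + 1] = (int)(B[N] + R2M + B[N - 1]);
    return (int)(tj + A[N + 1] + B[N + 1]);
}

/* Serial reference: one pair of scratch buffers reused by every iteration. */
void compute_serial(const double *t, int size, const int *d, int N, int *output)
{
    assert(N >= 1 && size >= 0);
    int *A = calloc((size_t)N + 2, sizeof *A);
    int *B = calloc((size_t)N + 2, sizeof *B);
    if (!A || !B)
        abort(); /* keep the sketch simple: give up on allocation failure */
    for (int j = 0; j < size; j++)
        output[j] = eval_one(t[j], d, N, A, B);
    free(A);
    free(B);
}

/* Parallel version: variables declared inside the parallel region are
   private to each thread, so every thread gets its own A and B. */
void compute_parallel(const double *t, int size, const int *d, int N, int *output)
{
    assert(N >= 1 && size >= 0);
    #pragma omp parallel
    {
        int *A = calloc((size_t)N + 2, sizeof *A);
        int *B = calloc((size_t)N + 2, sizeof *B);
        if (!A || !B)
            abort();
        /* Iterations of j are independent; each thread writes distinct output[j]. */
        #pragma omp for
        for (int j = 0; j < size; j++)
            output[j] = eval_one(t[j], d, N, A, B);
        free(A);
        free(B);
    }
}
```

Declaring the temporaries inside the parallel region or inside the helper avoids having to list every scalar in a private clause, which is easy to get wrong when variables are declared at the top of main as in the question.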