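Below is a program that computes the factorial of each element in an array of random numbers. The parallel version performs much worse than the serial one, even for large inputs. What is the right approach to improving performance with OpenMP, and how can the OpenMP-parallelized code be optimized further?
Code: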
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
#include <time.h>
int main( )
{
    int i,j,k,num,thread;
    int *arr,*result,temp;
    time_t t;
    srand((unsigned)time(&t));
    scanf("%d",&num);
    arr = (int*)malloc(sizeof(int)*num);
    result = (int*)malloc(sizeof(int)*num);
    for(i=0;i<num;i++){
        arr[i]=rand()%10;
    }
    for(i=0;i<num;i++){
        result[i]=1;
    }
    clock_t begin, end;
    double time_spent_omp;
    double time_spent;
    begin = clock();
    /* here, do your time-consuming job */
    #pragma omp parallel for private(temp)
    for(j=0;j<num;j++){
        temp = arr[j];
        for(i=0;i<temp;temp--)
            result[j]=result[j]*temp;
    }
    end = clock();
    time_spent_omp = (double)(end - begin) / CLOCKS_PER_SEC;
    /*
    for(i=0;i<num;i++){
        printf("%d\t%d\n",arr[i],result[i]);
    }*/
    for(i=0;i<num;i++){
        result[i]=1;
    }
    begin = clock();
    for(j=0;j<num;j++){
        temp = arr[j];
        for(i=0;i<temp;temp--)
            result[j]=result[j]*temp;
    }
    end = clock();
    time_spent = (double)(end - begin)/ CLOCKS_PER_SEC;
    /*
    for(i=0;i<num;i++){
        printf("%d\t%d\n",arr[i],result[i]);
    }*/
    printf("Time for serial is %f\nTime for openMP is %f\n",time_spent, time_spent_omp);
    return 0;
}
Output:
rnt@rnt-laptop:~/Desktop/C$ gcc -fopenmp -o fact fact.c
rnt@rnt-laptop:~/Desktop/C$ ./fact
5
Time for serial is 0.000004
Time for openMP is 0.006214
rnt@rnt-laptop:~/Desktop/C$ ./fact
11
Time for serial is 0.000013
Time for openMP is 0.000391
rnt@rnt-laptop:~/Desktop/C$ ./fact
111
Time for serial is 0.000078
Time for openMP is 0.000507
rnt@rnt-laptop:~/Desktop/C$ ./fact
1111
Time for serial is 0.000454
Time for openMP is 0.000860
rnt@rnt-laptop:~/Desktop/C$ ./fact
11111
Time for serial is 0.002947
Time for openMP is 0.004829
rnt@rnt-laptop:~/Desktop/C$ ./fact
111111
Time for serial is 0.022903
Time for openMP is 0.044273
rnt@rnt-laptop:~/Desktop/C$ ./fact
1111111
Time for serial is 0.030446
Time for openMP is 0.160402
rnt@rnt-laptop:~/Desktop/C$ ./fact
11111111
Time for serial is 0.298610
Time for openMP is 1.580710
rnt@rnt-laptop:~/Desktop/C$ ./fact
111111111
Time for serial is 2.993646
Time for openMP is 13.202524
Answer 0 (score: 0)
Try using:
#pragma omp parallel for private(temp) schedule(static,XX)
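with a few different values of XX, such as 10, 50, 100, 1000, and so on.
Note that the schedule used when no schedule clause is given is implementation-defined, and common compilers such as GCC use static scheduling in that case, not dynamic. Dynamic scheduling is the better choice for parallelized code whose iterations are unevenly balanced across cores; when every iteration costs about the same, as here, static scheduling with a well-chosen chunk size has less overhead.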
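The chunk-size experiment could be sketched roughly as follows. This is a minimal sketch rather than the asker's exact program: the helper name factorials_chunked and its parameters are hypothetical, and the loop temporaries are declared inside the loop body so that each thread automatically gets its own copy.

```c
#include <assert.h>

/* Hypothetical helper (not from the question): stores arr[j]! in result[j],
 * handing out iterations to threads in fixed blocks of `chunk` iterations. */
void factorials_chunked(const int *arr, int *result, int n, int chunk)
{
    int j;

    assert(chunk > 0);  /* OpenMP requires a positive chunk size */

    #pragma omp parallel for schedule(static, chunk)
    for (j = 0; j < n; j++) {
        /* f and t live inside the loop body, so they are private per thread */
        int f = 1;
        for (int t = arr[j]; t > 1; t--)
            f *= t;
        result[j] = f;
    }
}
```

Calling it as factorials_chunked(arr, result, num, 1000), then repeating the run with 10, 50 and 100, gives a direct comparison of chunk sizes without editing the pragma each time.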
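Edit: the inner loop iterator i also needs to be private. It is declared outside the parallel region, so all threads share it and interfere with each other's inner loops (only j, the iterator of the parallelized loop, is made private automatically). List it in the clause, for example private(i, temp). After that, you can try different schedules and chunk sizes to tune the performance.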
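A sketch of the loop with the iterator made private is shown here. The function names factorials_private and wall_seconds are hypothetical. The timing helper is an addition that the answer does not discuss: clock() reports CPU time accumulated by all threads of the process, so it tends to overstate the cost of a multithreaded region, whereas omp_get_wtime() reports elapsed wall-clock time. The fallback branch is only there so the file still builds without -fopenmp.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENMP
#include <omp.h>
/* Elapsed wall-clock time in seconds when OpenMP is enabled */
static double wall_seconds(void) { return omp_get_wtime(); }
#else
#include <time.h>
/* Fallback for builds without OpenMP: single-threaded, so CPU time is fine */
static double wall_seconds(void) { return (double)clock() / CLOCKS_PER_SEC; }
#endif

/* Hypothetical helper: same loop shape as in the question, but both i and
 * temp are listed as private so that no thread touches another's copy. */
void factorials_private(const int *arr, int *result, int n)
{
    int i, j, temp;

    assert(arr != NULL && result != NULL);

    #pragma omp parallel for private(i, temp)
    for (j = 0; j < n; j++) {
        temp = arr[j];
        result[j] = 1;
        for (i = 0; i < temp; temp--)
            result[j] = result[j] * temp;
    }
}
```

To time it, take wall_seconds() immediately before and after the call and subtract the two values; timing the serial loop the same way keeps the comparison like for like.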