Improving performance with OpenMP

Time: 2015-09-23 18:41:33

Tags: c performance parallel-processing openmp

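The program below computes the factorial of each number in an array of random numbers. The serial version is much faster than the parallel one, even for large inputs. What is the right approach to improving performance with OpenMP, and how can the OpenMP-parallelized code be optimized further?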

Code -

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
#include <time.h>

int main( )
{
    int i,j,k,num,thread;
    int *arr,*result,temp;
    time_t t;
    srand((unsigned)time(&t));
    scanf("%d",&num);
    arr = (int*)malloc(sizeof(int)*num);
    result = (int*)malloc(sizeof(int)*num);

    for(i=0;i<num;i++){
        arr[i]=rand()%10;
    }

    for(i=0;i<num;i++){
        result[i]=1;
    }   

    clock_t begin, end;
    double time_spent_omp;
    double time_spent;

    begin = clock();
    /* here, do your time-consuming job */

        #pragma omp parallel for private(temp)
        for(j=0;j<num;j++){
            temp = arr[j];
            for(i=0;i<temp;temp--)
            result[j]=result[j]*temp;
        }   


    end = clock();
    time_spent_omp = (double)(end - begin) / CLOCKS_PER_SEC;

    /*
    for(i=0;i<num;i++){
        printf("%d\t%d\n",arr[i],result[i]);
    }*/

    for(i=0;i<num;i++){
        result[i]=1;
    }   

    begin = clock();

    for(j=0;j<num;j++){
        temp = arr[j];
        for(i=0;i<temp;temp--)
        result[j]=result[j]*temp;
    }

    end = clock();
    time_spent = (double)(end - begin)/ CLOCKS_PER_SEC;

    /*
    for(i=0;i<num;i++){
        printf("%d\t%d\n",arr[i],result[i]);
    }*/

    printf("Time for serial is %f\nTime for openMP is %f\n",time_spent, time_spent_omp);

    return 0;
}

Output -

rnt@rnt-laptop:~/Desktop/C$ gcc -fopenmp -o fact fact.c
rnt@rnt-laptop:~/Desktop/C$ ./fact 
5
Time for serial is 0.000004
Time for openMP is 0.006214
rnt@rnt-laptop:~/Desktop/C$ ./fact 
11
Time for serial is 0.000013
Time for openMP is 0.000391
rnt@rnt-laptop:~/Desktop/C$ ./fact 
111
Time for serial is 0.000078
Time for openMP is 0.000507
rnt@rnt-laptop:~/Desktop/C$ ./fact 
1111
Time for serial is 0.000454
Time for openMP is 0.000860
rnt@rnt-laptop:~/Desktop/C$ ./fact 
11111
Time for serial is 0.002947
Time for openMP is 0.004829
rnt@rnt-laptop:~/Desktop/C$ ./fact 
111111
Time for serial is 0.022903
Time for openMP is 0.044273
rnt@rnt-laptop:~/Desktop/C$ ./fact 
1111111
Time for serial is 0.030446
Time for openMP is 0.160402
rnt@rnt-laptop:~/Desktop/C$ ./fact 
11111111
Time for serial is 0.298610
Time for openMP is 1.580710
rnt@rnt-laptop:~/Desktop/C$ ./fact 
111111111
Time for serial is 2.993646
Time for openMP is 13.202524

1 answer:

Answer 0 (score: 0)

Try using:

#pragma omp parallel for private(temp) schedule(static,XX) 

with several values of XX, such as 10, 50, 100, 1000, and so on.

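The default schedule is implementation-defined (with GCC it is normally static). Dynamic scheduling, schedule(dynamic,XX), tends to work better for parallelized code whose iterations are unevenly balanced across cores.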
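The chunk-size experiment suggested above could be sketched roughly as follows. This is only an illustration: the helper names wall_seconds, factorials_static and try_chunk_sizes are hypothetical and do not come from the question. The sketch times with omp_get_wtime() rather than clock(), on the assumption that clock() reports CPU time summed over all threads (as it does on Linux), which makes a parallel region look slower than it really is.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds. Falls back to clock() when built without -fopenmp,
 * in which case the loop below runs on a single thread anyway. */
static double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical helper: result[j] = arr[j]! for every j, using
 * schedule(static, chunk). Returns the elapsed time in seconds. */
double factorials_static(const int *arr, int *result, int num, int chunk)
{
    assert(chunk > 0); /* the chunk size must be positive */
    double start = wall_seconds();

    #pragma omp parallel for schedule(static, chunk)
    for (int j = 0; j < num; j++) {
        int f = 1; /* declared inside the loop, so private to each thread */
        for (int t = arr[j]; t > 1; t--)
            f *= t;
        result[j] = f;
    }

    return wall_seconds() - start;
}

/* Run the loop once for each chunk size suggested in the answer. */
void try_chunk_sizes(const int *arr, int *result, int num)
{
    static const int chunks[] = {10, 50, 100, 1000};
    for (size_t c = 0; c < sizeof chunks / sizeof chunks[0]; c++)
        printf("chunk %4d: %f s\n", chunks[c],
               factorials_static(arr, result, num, chunks[c]));
}
```

Calling try_chunk_sizes(arr, result, num) from the question's main, after arr has been filled, prints one timing per chunk size. Build with gcc -fopenmp as in the question.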

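Edit: you also need to make the inner loop iterator i a private variable, for example private(i,temp). The iterator of the parallelized loop, j, is private automatically, but i is shared between the threads, which is a data race. You can also try different schedules and vary their parameters to tune thread performance.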
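A minimal sketch of the private-iterator fix, keeping the loop structure from the question. The function names factorials_private and factorials_serial are hypothetical, and unlike the question this sketch resets result[j] inside the loop so that each function can be called on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the loop from the question with i added to the
 * private clause. j is the iterator of the parallel loop, so OpenMP
 * makes it private automatically. */
void factorials_private(const int *arr, int *result, int num)
{
    int i, j, temp;

    assert(arr != NULL && result != NULL);

    #pragma omp parallel for private(i, temp)
    for (j = 0; j < num; j++) {
        result[j] = 1;
        temp = arr[j];
        /* counts temp down to 1, as in the question; i stays at 0 */
        for (i = 0; i < temp; temp--)
            result[j] = result[j] * temp;
    }
}

/* Serial reference version, used to check the parallel results. */
void factorials_serial(const int *arr, int *result, int num)
{
    int i, j, temp;

    assert(arr != NULL && result != NULL);

    for (j = 0; j < num; j++) {
        result[j] = 1;
        temp = arr[j];
        for (i = 0; i < temp; temp--)
            result[j] = result[j] * temp;
    }
}
```

Comparing the two output arrays element by element is a cheap way to confirm that the parallel version still computes the same factorials before measuring its speed.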