Poor speedup for a simple OpenMP saxpy

Time: 2015-11-25 19:11:02

Tags: c parallel-processing openmp

I can't get a simple SAXPY program to scale its performance properly with OpenMP.

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(int argc, char** argv){
    int N = atoi(argv[1]), threads = atoi(argv[2]), i;
    omp_set_num_threads(threads);
    double a = 3.141592, *x, *y, t1, t2;
    x = (double*)malloc(sizeof(double)*N);
    y = (double*)malloc(sizeof(double)*N);

    for(i = 0; i < N; ++i){
        x[i] = y[i] = (double)i;
    }

    t1 = omp_get_wtime();
    #pragma omp parallel for default(none) private(i) shared(a, N, x,y)
    for(i = 0; i < N; ++i){
        y[i] = a*x[i] + y[i];
    }
    t2 = omp_get_wtime();

    printf("%f secs\n", t2-t1);
}

I am compiling with:

gcc main.c -lm -O3 -fopenmp -o prog

The timings I get for 10M elements are:

threads = 1  0.015097 secs
threads = 2  0.013954 secs

Any idea what the problem is?

1 answer:

Answer 0: (score: 1)

You forgot the for in the #pragma omp directive. It should read:

#pragma omp parallel for default(none) private(i) shared(a, N, x,y)

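Without for there is no work sharing: each thread iterates over the entire range [0, N), so adding threads does not divide the work among them.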