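Good day. I want to implement the inner product in 3 ways: 1- sequential 2- half-parallel 3- full-parallel
Half-parallel means the multiplications run in parallel and the summation runs sequentially.
Here is my code: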
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(int argc, char *argv[]) {
    int *x, *y, *z, *w, xy_p, xy_s, xy_ss, i, N=5000;
    double s, e;

    x = (int *) malloc(sizeof(int)*N);
    y = (int *) malloc(sizeof(int)*N);
    z = (int *) malloc(sizeof(int)*N);
    w = (int *) malloc(sizeof(int)*N);

    /* Fill the inputs with random values; z will hold the element-wise products. */
    for(i=0; i < N; i++) {
        x[i] = rand();
        y[i] = rand();
        z[i] = 0;
    }

    /* 1) Sequential: multiply and accumulate in a single loop. */
    s = omp_get_wtime();
    xy_ss = 0;
    for(i=0; i < N; i++)
    {
        xy_ss += x[i] * y[i];
    }
    e = omp_get_wtime() - s;
    printf ( "[**] Sequential execution time is:\n%15.10f and <A,B> is %d\n", e, xy_ss );

    /* 2) Half-parallel: products computed in parallel, then summed sequentially. */
    s = omp_get_wtime();
    xy_s = 0;
    #pragma omp parallel for shared ( N, x, y, z ) private ( i )
    for(i = 0; i < N; i++)
    {
        z[i] = x[i] * y[i];
    }
    for(i=0; i < N; i++)
    {
        xy_s += z[i];
    }
    e = omp_get_wtime() - s;
    printf ( "[**] Half-Parallel execution time is:\n%15.10f and <A,B> is %d\n", e, xy_s );

    /* 3) Full-parallel: multiply and sum together using an OpenMP reduction. */
    s = omp_get_wtime();
    xy_p = 0;
    # pragma omp parallel shared (N, x, y) private(i)
    # pragma omp for reduction ( + : xy_p )
    for(i = 0; i < N; i++)
    {
        xy_p += x[i] * y[i];
    }
    e = omp_get_wtime() - s;
    printf ( "[**] Full-Parallel execution time is:\n%15.10f and <A,B> is %d\n", e, xy_p );

    free(x);
    free(y);
    free(z);
    free(w);
    return 0;
}
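A side note on the arithmetic in the listing above: the results in the sample output are negative, which suggests that `x[i] * y[i]` and the running `int` sum overflow. Below is a minimal sketch of a reduction with a 64-bit accumulator. The name `dot_ll` is hypothetical, and the sketch assumes the inputs are small enough (for example `rand() % 1000`) that the total fits in a `long long`; with the full range of `rand()` even 64 bits may not be enough.

```c
/* Hypothetical helper, not part of the original listing:
   dot product with a 64-bit accumulator.
   Assumes the sum of products fits in a long long.
   The pragma is simply ignored when OpenMP is not enabled. */
long long dot_ll(const int *x, const int *y, long long n)
{
    long long sum = 0;
    long long i;

    #pragma omp parallel for reduction(+ : sum)
    for (i = 0; i < n; i++) {
        /* Widen before multiplying so the product is computed in 64 bits. */
        sum += (long long)x[i] * (long long)y[i];
    }
    return sum;
}
```

It would be called as `dot_ll(x, y, N)` and printed with `%lld` instead of `%d`.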
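So I have some questions. First: is my code correct? Second: why is half-parallel faster than sequential?! Third: is 5000 a suitable size for parallel processing? And finally: why is sequential the fastest? Is it because of the 5000?
Sample output: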
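Sequential execution time is: 0.0000196100 and <A,B> is -1081001655
Half-Parallel execution time is: 0.0090819710 and <A,B> is -1081001655
Full-Parallel execution time is: 0.0080959420 and <A,B> is -1081001655
And for N = 5000000:
Sequential execution time is: 0.0150297650 and <A,B> is -1629514371
Half-Parallel execution time is: 0.0292110600 and <A,B> is -1629514371
Full-Parallel execution time is: 0.0072323760 and <A,B> is -1629514371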
Anyway, why is half-parallel the slowest?
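On the question of whether 5000 elements are enough: in the timings above, both parallel versions take milliseconds at N = 5000 while the sequential loop takes microseconds, which is consistent with the cost of starting the thread team dominating such a small loop. One way to experiment with this is the OpenMP `if` clause, which runs the region on a single thread when its condition is false. The sketch below is only an illustration: `dot_adaptive` and the cutoff `PAR_THRESHOLD` are hypothetical, and the cutoff value is a guess that would have to be tuned by measurement on the target machine.

```c
/* Hypothetical cutoff: at or below this size the loop runs on one thread.
   The value is an assumption, not a measured optimum. */
#define PAR_THRESHOLD 100000

/* Hypothetical helper: dot product that only goes parallel for large n.
   Assumes the sum of products fits in a long long. */
long long dot_adaptive(const int *x, const int *y, long long n)
{
    long long sum = 0;
    long long i;

    /* The if clause disables parallel execution when n is small. */
    #pragma omp parallel for reduction(+ : sum) if (n > PAR_THRESHOLD)
    for (i = 0; i < n; i++) {
        sum += (long long)x[i] * y[i];
    }
    return sum;
}
```

Timing `dot_adaptive(x, y, N)` for several values of N and several cutoffs should show roughly where parallel execution starts to pay off on a given machine.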