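I have been trying to parallelize the following code with OpenMP, without success. I searched the internet for several examples, but none of them gives me the same answer when I run the program several times.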
#include <stdio.h>
#include <omp.h>
#define NUM_THREADS 2
long num_steps = 100000;
double step = 1.0/100000.0;
int main() {
int i;
double x, pi, sum = 0.0;
for(i = 0; i < num_steps; ++i) {
x = (i-0.5)*step;
sum += 4.0/(1.0+x*x);
}
pi = step*sum;
printf("PI value = %f\n", pi);
}
This is my solution so far:
int main (int argc, char **argv){
//Variables
int i=0, aux=0;
double step = 1.0/100000.0;
double x=0.0,
pi=0.0,
sum = 0.0;
#pragma omp parallel shared(sum,i) private(x)
{
x = 0.0;
sum = 0.0;
#pragma omp for
for (i=0; i<num_steps; ++i) {
x = (i-0.5)*step;
#pragma omp critical
sum += 4.0/(1.0+x*x);
}
}
/* All threads join master thread and terminate */
pi= step*sum;
printf("PI value = %f\n", pi);
}
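Answer 0 (score: 0)
Consider using the same loop construct that the official OpenMP website describes: loop parallelism. I had to change many lines in your code. My goal is not to hand you a whole program, but since this is meant as help and as a tutorial, I am posting the complete program. I hope it serves as a starting point for getting more familiar with OpenMP and loop parallelism in C.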
#include <stdio.h>
#include <omp.h>
#define NUM_STEPS 10000000
int main (int argc, char **argv){
//Variables
long int i, num_steps = NUM_STEPS;
double x, step, sum, pi;
sum = 0.0;
step = 1.0 / (double) num_steps;
#pragma omp parallel private(i,x)
{
#pragma omp for reduction(+:sum)
for (i=0; i<num_steps; ++i) {
x = (i+0.5)*step;
sum += 4.0/(1.0+x*x);
}
}
/* All threads join master thread and terminate */
pi= step*sum;
printf("PI value = %.24f\n", pi);
return 0;
}
Answer 1 (score: -1)
The answer is:
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
long num_steps = 100000;
double step = 1.0/100000.0;
int main() {
int i;
double x, pi, sum = 0.0;
#pragma omp parallel private(x)
{
#pragma omp for reduction(+:sum)
for(i = 0; i < num_steps; ++i) {
x = (i-0.5)*step;
sum += 4.0/(1.0+x*x);
}
}
pi = step*sum;
printf("PI value = %f\n", pi);
}
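Answer 2 (score: -1)
Your main problem is that you declared the loop index i as shared, which suggests that every thread uses the same i in its evaluation. What you really want OpenMP to do is split the whole range of i into chunks and assign a different chunk to each thread, so specify i as private. (Inside the omp for construct itself, OpenMP already treats the iteration variable as private, so the explicit clause mainly states the intent.)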
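Beyond that, do not reinitialize x and sum inside the parallel region. Every thread executes sum = 0.0, and there is no barrier before the loop, so a thread that arrives late can wipe out contributions that other threads have already added. After fixing a few unrelated compile errors, your code should look like this: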
#include <stdio.h>
#include <omp.h>
#define NUM_THREADS 2
int main (int argc, char **argv){
//Variables
int i=0, aux=0;
double step = 1.0/100000.0;
long num_steps = 100000;
double x=0.0,
pi=0.0,
sum = 0.0;
#pragma omp parallel shared(sum) private(i,x)
{
#pragma omp for
for (i=0; i<num_steps; ++i) {
x = (i-0.5)*step;
#pragma omp critical
sum += 4.0/(1.0+x*x);
}
}
/* All threads join master thread and terminate */
pi= step*sum;
printf("PI value = %f\n", pi);
}
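Keep in mind that this is far from ideal in terms of performance: only one thread at a time can be inside the critical section, so every update of sum serializes the threads. The first step toward faster code is to remove the critical section and declare sum as a reduction instead: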
#pragma omp parallel private(i,x)
{
#pragma omp for reduction(+:sum)
for (i=0; i<num_steps; ++i) {
x = (i-0.5)*step;
sum += 4.0/(1.0+x*x);
}
}