I am using OpenMP to speed up a Monte Carlo estimate of pi. All I did was add a pragma to the sequential code. The code is below.
float host_monte_carlo_parallel(long trials, int noOfThreads) {
    float x, y;
    long points_in_circle = 0;  /* must start at 0: the reduction adds onto this value */
    long i;
    #pragma omp parallel for num_threads(noOfThreads) private(i, x, y) reduction(+:points_in_circle)
    for (i = 0; i < trials; i++) {
        /* rand() uses one hidden state shared by every thread */
        x = rand() / (float) RAND_MAX;
        y = rand() / (float) RAND_MAX;
        //printf("%ld\n", i);
        points_in_circle += (x * x + y * y <= 1.0f);
    }
    return 4.0f * points_in_circle / trials;
}
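The problem is that the sequential code finishes far sooner than the parallel code. Am I using the pragma correctly? The approximate run times are: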
CPU pi calculated in 6.413644 s.
CPU parallel pi calculated in 203.746460 s.
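
One likely cause of the slowdown is that rand() keeps a single hidden state shared by all threads, so the threads either contend for it or race on it. Below is a minimal sketch, not a definitive fix, that gives each thread its own generator state via POSIX rand_r(). The function name monte_carlo_pi_rand_r and the seeding formula are assumptions made up for illustration, and rand_r() is a low-quality generator that is only available on POSIX systems.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes rand_r() under strict -std= modes */
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical variant of host_monte_carlo_parallel: each thread owns its
 * RNG state, so no thread touches the shared state behind rand(). */
float monte_carlo_pi_rand_r(long trials, int noOfThreads) {
    long points_in_circle = 0;  /* reduction target, must start at zero */

    #pragma omp parallel num_threads(noOfThreads) reduction(+:points_in_circle)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        /* Assumed seeding scheme: a distinct, arbitrary seed per thread. */
        unsigned int seed = 12345u + 7919u * (unsigned int) tid;

        #pragma omp for
        for (long i = 0; i < trials; i++) {
            /* x and y are declared inside the loop, so they are private. */
            float x = rand_r(&seed) / (float) RAND_MAX;
            float y = rand_r(&seed) / (float) RAND_MAX;
            points_in_circle += (x * x + y * y <= 1.0f);
        }
    }
    return 4.0f * points_in_circle / trials;
}
```

Build with something like gcc -fopenmp and call monte_carlo_pi_rand_r(trials, noOfThreads) in place of the original function. Without -fopenmp the pragmas are ignored and the function simply runs sequentially.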