I'm learning OpenMP and C and am having trouble with a simple program.
I set the following environment variables in my .bashrc:
# define how many threads you want
export OMP_NUM_THREADS=4
#allow to switch number of threads
export OMP_DYNAMIC=true
#allow nested parallel regions
export OMP_NESTED=true
This is the program I'm trying to run:
#include <stdio.h> /* input, output */
#include <omp.h> /* openMP library */
#include <time.h> /* measure time */
#define N 100000000 // if sourcearray not static, I'll be overflowing the stack.
                    // > ~10^6 elements is a lot for most systems.
void forloop(void);
int
main(void)
{
    /* worksharing: for loop */
    forloop();
    return(0);
}
/*=============================================================*/
/*=============================================================*/
void forloop(void){
    /*do a for loop sequentially and in parallel; measure each times */
    printf("=====================\n");
    printf("FOR LOOP\n");
    printf("=====================\n\n");
    long i;
    clock_t start, end;
    double cpu_time_used;
    static double sourcearray[N];
    /*============*/
    /*measure time*/
    /*============*/
    start=clock();
    for (i=0; i<N; i++){
        sourcearray[i] = ((double) (i)) * ((double) (i))/2.2034872;
    }
    end = clock();
    cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
    printf("Non-parallel needed %lf s\n", cpu_time_used);
    /*===============*/
    /*parallel region*/
    /*===============*/
    #pragma omp parallel
    /*need to specify num_threads, when OMP_DYNAMIC=true to make sure 4 are used.*/
    {
        omp_set_num_threads(4);
        double starttime_omp, endtime_omp;
        /*time measurement*/
        starttime_omp=omp_get_wtime();
        int procs, maxt, nt, id;
        procs = omp_get_num_procs(); // number of processors in use
        maxt = omp_get_max_threads(); // max available threads
        nt = omp_get_num_threads();
        id = omp_get_thread_num();
        printf("num threads forloop %d from id %d, procs: %d, maxthrds: %d\n", nt, id, procs, maxt);
        #pragma omp for
        for (i=0; i<N; i++){
            sourcearray[i] = ((double) (i)) * ((double) (i))/2.2034872;
        }
        endtime_omp = omp_get_wtime();
        cpu_time_used = ((endtime_omp - starttime_omp)) ;
    } /* end parallel region */
}
I compile the code with:
gcc -g -Wall -fopenmp -o omp_worksharing.exe omp_worksharing.c
The program compiles with a warning that I don't quite understand:
omp_worksharing.c: In function ‘forloop’:
omp_worksharing.c:78:17: warning: variable ‘sourcearray’ set but not used [-Wunused-but-set-variable]
static double sourcearray[N];
But that's not the main issue.
The problem is that the program does not start 4 threads. This is the output:
=====================
FOR LOOP
=====================
Non-parallel needed 0.900340 s
num threads forloop 3 from id 0, procs: 8, maxthrds: 4
num threads forloop 3 from id 1, procs: 8, maxthrds: 4
num threads forloop 3 from id 2, procs: 8, maxthrds: 4
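The same happens when I use #pragma omp num_threads(4) instead of omp_set_num_threads(4);.
Even more strangely, when I leave out both #pragma omp num_threads(4) and omp_set_num_threads(4);, 3 threads are started most of the time, but sometimes 4. I couldn't find any pattern as to when or why, but my research suggests that OMP_DYNAMIC=true allows OpenMP to choose the number of threads itself in whatever way it considers optimal.
Why can't I specify the number of threads to use?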
Answer 0 (score: 2)
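Call omp_set_num_threads(4); before the #pragma omp parallel where you actually want it to take effect. The call only affects parallel regions encountered after it, so calling it inside the region is too late to change the size of that region's team.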
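The advice above can be sketched roughly as follows. This is only a minimal illustration, not the asker's full program: team_size_for is a hypothetical helper name, and the call to omp_set_dynamic(0) is an addition that the answer does not mention. It is included on the assumption that, with OMP_DYNAMIC=true as in the question, the runtime may otherwise still hand out fewer threads than requested. The _OPENMP guard is only there so that the sketch also compiles when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still compiles without -fopenmp. */
static void omp_set_dynamic(int flag) { (void)flag; }
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: request `requested` threads and return the size
   of the team that the following parallel region actually gets. */
int team_size_for(int requested)
{
    int team_size = 0;

    /* Both calls are made in serial code, before the region is entered. */
    omp_set_dynamic(0);              /* assumption: override OMP_DYNAMIC=true */
    omp_set_num_threads(requested);  /* applies to parallel regions that follow */

    #pragma omp parallel
    {
        /* One thread records the team size; the others wait at the
           implicit barrier that ends the single construct. */
        #pragma omp single
        team_size = omp_get_num_threads();
    }
    return team_size;
}
```

Calling team_size_for(4) from main and printing the result should report at most 4, and typically exactly 4 when the program is built with -fopenmp and the system can supply that many threads.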