I have this code in my main program:
#include <stdio.h>
#include <omp.h>
#include <cuda_runtime.h>

void test(unsigned int cpu_thread_id, int num_gpus);

int main(int argc, char **argv)
{
    // Get multi-CPU/multi-GPU data
    int num_gpus = 0;
    cudaGetDeviceCount(&num_gpus);
    printf("### Number of host CPUs:\t%d\n", omp_get_num_procs());
    printf("### Number of CUDA devices:\t%d\n", num_gpus);

    // Request one CPU thread per CUDA device
    omp_set_num_threads(num_gpus);
    #pragma omp parallel
    {
        unsigned int cpu_thread_id = omp_get_thread_num();
        printf("### (CPU thread %u)\n", cpu_thread_id);
        test(cpu_thread_id, num_gpus);
    }
    return 0;
}
and the test function:
void test(unsigned int cpu_thread_id, int num_gpus)
{
    // Each CPU thread is meant to drive the CUDA device with the same index
    printf("### Using CUDA device %u, GPU = %d\n", cpu_thread_id, num_gpus);
}
However, this is the only output I get:
### Number of host CPUs: 8
### Number of CUDA devices: 4
### (CPU thread 0)
### Using CUDA device 0, GPU = 4
But I expected to see output from more threads, like this:
### (CPU thread 1)
### Using CUDA device 1, GPU = 4
### (CPU thread 2)
### Using CUDA device 2, GPU = 4
### (CPU thread 3)
### Using CUDA device 3, GPU = 4
Why are the other three threads not running?
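One check that might narrow this down is sketched below. It is only a minimal sketch, and it assumes the cause could be that `#pragma omp parallel` is being ignored because OpenMP is not enabled at compile time, which I have not confirmed. The helper names `openmp_enabled` and `count_parallel_threads` are hypothetical and not part of my program:

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns 1 if this file was compiled with OpenMP enabled, 0 otherwise.
   The _OPENMP macro is defined by the compiler only when OpenMP is on. */
int openmp_enabled(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Counts how many threads actually execute a parallel region.
   If the pragma is ignored, the block runs once and the result is 1. */
int count_parallel_threads(int requested)
{
    int count = 0;
#ifdef _OPENMP
    omp_set_num_threads(requested);
#else
    (void)requested; /* unused when OpenMP is disabled */
#endif
    #pragma omp parallel
    {
        #pragma omp atomic
        count++;
    }
    return count;
}
```

If `openmp_enabled()` returns 0, the pragma was ignored and `count_parallel_threads(4)` returns 1, which would match the single thread in my output. With GCC, OpenMP is enabled with `-fopenmp`; when building with nvcc, that flag is usually passed to the host compiler as `-Xcompiler -fopenmp`.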
Thanks in advance.