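I am writing a C++ program and parallelizing it with OpenMP. The machine has 2 CPU sockets with 8 cores per socket.
Since I compile with the Intel compiler, I set the following environment variable: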
export KMP_AFFINITY=verbose,scatter
With the verbose option, I see the following messages when I run the binary:
[0] OMP: Info #204: KMP_AFFINITY: decoding x2APIC ids.
[0] OMP: Info #202: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
[0] OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0}
[0] OMP: Info #156: KMP_AFFINITY: 1 available OS procs
[0] OMP: Info #157: KMP_AFFINITY: Uniform topology
[0] OMP: Info #159: KMP_AFFINITY: 1 packages x 1 cores/pkg x 1 threads/core (1 total cores)
[0] OMP: Info #206: KMP_AFFINITY: OS proc to physical thread map:
[0] OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 0 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 14 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 15 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 11 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 6 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 7 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 8 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 9 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 10 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 13 bound to OS proc set {0}
[0] OMP: Info #242: KMP_AFFINITY: pid 12759 thread 12 bound to OS proc set {0}
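As you can see, the OpenMP runtime does not detect the correct number of packages (sockets) or cores per package: it reports 1 package with 1 core instead of 2 packages with 8 cores each. As a result, all threads are pinned to a single core.
How can I fix this? Where should I start looking?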
Answer 0 (score: 1)
I am answering my own question.
My program sets the CPU affinity of the main thread as follows:
...
CPU_ZERO(&cpuset);
CPU_SET(0, &cpuset);
pid_t tid = (pid_t) syscall(SYS_gettid);
sched_setaffinity(tid, sizeof(cpu_set_t), &cpuset);
unsigned long mask = -1;
int rc = sched_getaffinity(tid, sizeof(unsigned long), (cpu_set_t*) &mask);
if (rc != 0) {
std::cout << "ERROR calling sched_getaffinity; " << rc << std::endl;
abort();
}
...
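OpenMP threads created after the sched_setaffinity system call inherit the main thread's affinity mask, so they are all bound to the same core as the main thread. This matches the log above: the runtime reports "Initial OS proc set respected: {0}" and therefore sees only 1 available OS proc.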
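One possible workaround, if the main thread really needs to be pinned for a while, is to save the original affinity mask first and restore it before the OpenMP runtime starts its threads. The following is only a minimal sketch under that assumption, for Linux. The helper names get_affinity, pin_to_cpu and set_affinity are hypothetical and not part of my original program; only sched_getaffinity, sched_setaffinity and the CPU_* macros are real glibc interfaces.

```cpp
#include <sched.h>  // sched_getaffinity, sched_setaffinity, cpu_set_t, CPU_* (Linux/glibc)

// Hypothetical helper: read the calling thread's current affinity mask.
// A pid of 0 tells the kernel to operate on the calling thread.
bool get_affinity(cpu_set_t& out) {
    CPU_ZERO(&out);
    return sched_getaffinity(0, sizeof(cpu_set_t), &out) == 0;
}

// Hypothetical helper: apply a previously saved affinity mask to the calling thread.
bool set_affinity(const cpu_set_t& set) {
    return sched_setaffinity(0, sizeof(cpu_set_t), &set) == 0;
}

// Hypothetical helper: pin the calling thread to exactly one CPU.
// Threads created afterwards inherit this one-CPU mask.
bool pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return set_affinity(set);
}
```

A caller would use get_affinity to save the mask before calling pin_to_cpu, and then call set_affinity with the saved mask before the first parallel region, so that the runtime sees all cores again when it reads the initial OS proc set. I am assuming here that the runtime reads the mask when it initializes, which the "Initial OS proc set respected" message suggests. The simpler alternative is to not pin the main thread at all and leave placement to KMP_AFFINITY.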