How can I find out which thread is executing on which core of the GPU?

Asked: 2015-03-05 15:26:16

Tags: cuda gpu nvidia

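I am developing some simple programs in CUDA, and I want to know which thread is executing on which core of the GPU. I am using Visual Studio 2012 and have an NVIDIA GeForce 610M graphics card.

Is it possible to do this? I have searched Google extensively, but to no avail.

Edit

I know this is a really strange request, but my university project guide has asked me to do it.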

1 answer:

Answer 0 (score: 6)

Combining information from the PTX manual with a simple inline-PTX wrapper, the following functions should give you what you need:

static __device__ __inline__ uint32_t __mysmid(){    
  uint32_t smid;    
  asm volatile("mov.u32 %0, %%smid;" : "=r"(smid));    
  return smid;}

The function above tells you which multiprocessor (SM) the calling thread is executing on.

static __device__ __inline__ uint32_t __mywarpid(){    
  uint32_t warpid;    
  asm volatile("mov.u32 %0, %%warpid;" : "=r"(warpid));    
  return warpid;}

The function above tells you which warp the calling thread belongs to.

static __device__ __inline__ uint32_t __mylaneid(){    
  uint32_t laneid;    
  asm volatile("mov.u32 %0, %%laneid;" : "=r"(laneid));    
  return laneid;}

The function above tells you which warp lane the calling thread occupies.
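As a cross-check on the lane ID, here is a minimal sketch that compares `%laneid` with `threadIdx.x % warpSize`. It assumes a one-dimensional block, where consecutive threads are packed into warps in order. The names `lane_id_from_ptx`, `compare_lane_ids` and `count_lane_mismatches` are hypothetical helpers made up for this illustration, not part of any CUDA API.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

// Reads the PTX special register %laneid (same inline-PTX wrapper idea as above).
static __device__ __forceinline__ uint32_t lane_id_from_ptx() {
  uint32_t laneid;
  asm volatile("mov.u32 %0, %%laneid;" : "=r"(laneid));
  return laneid;
}

// Counts threads whose %laneid differs from threadIdx.x % warpSize.
// Assumption: the block is one-dimensional.
__global__ void compare_lane_ids(unsigned int *mismatches) {
  uint32_t computed = threadIdx.x % warpSize;
  if (lane_id_from_ptx() != computed) {
    atomicAdd(mismatches, 1u);
  }
}

// Hypothetical host helper: launches the kernel and returns the number of
// mismatching threads, or -1 if a CUDA call fails.
int count_lane_mismatches(int blocks, int threads_per_block) {
  unsigned int *d_mismatches = nullptr;
  unsigned int h_mismatches = 0;
  if (cudaMalloc(&d_mismatches, sizeof(unsigned int)) != cudaSuccess) return -1;
  if (cudaMemset(d_mismatches, 0, sizeof(unsigned int)) != cudaSuccess) {
    cudaFree(d_mismatches);
    return -1;
  }
  compare_lane_ids<<<blocks, threads_per_block>>>(d_mismatches);
  // The blocking copy also surfaces errors from the kernel launch.
  cudaError_t err = cudaMemcpy(&h_mismatches, d_mismatches,
                               sizeof(unsigned int), cudaMemcpyDeviceToHost);
  cudaFree(d_mismatches);
  return err == cudaSuccess ? static_cast<int>(h_mismatches) : -1;
}

int main() {
  printf("lane mismatches: %d\n", count_lane_mismatches(4, 96));
  return 0;
}
```

If this reports zero mismatches on your GPU, the lane can be computed from `threadIdx.x` alone for 1D blocks. The inline-PTX wrapper is still handy for multi-dimensional blocks, where you would otherwise have to linearize the thread index first.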

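Note that with dynamic parallelism (and possibly in other scenarios, such as debugging), this information is volatile and may change during program execution.

For definitions of terms such as multiprocessor and warp, see the programming guide.

Here is a complete example: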

$ cat t646.cu
#include <stdio.h>
#include <stdint.h>

static __device__ __inline__ uint32_t __mysmid(){
  uint32_t smid;
  asm volatile("mov.u32 %0, %%smid;" : "=r"(smid));
  return smid;}

static __device__ __inline__ uint32_t __mywarpid(){
  uint32_t warpid;
  asm volatile("mov.u32 %0, %%warpid;" : "=r"(warpid));
  return warpid;}

static __device__ __inline__ uint32_t __mylaneid(){
  uint32_t laneid;
  asm volatile("mov.u32 %0, %%laneid;" : "=r"(laneid));
  return laneid;}


__global__ void mykernel(){

  int idx = threadIdx.x+blockDim.x*blockIdx.x;
  // The ID helpers return uint32_t, so print them with %u rather than %d.
  printf("I am thread %d, my SM ID is %u, my warp ID is %u, and my warp lane is %u\n", idx, __mysmid(), __mywarpid(), __mylaneid());
}

int main(){

  mykernel<<<4,4>>>();
  cudaDeviceSynchronize();
  return 0;
}
$ nvcc -arch=sm_20 -o t646 t646.cu
$ ./t646
I am thread 0, my SM ID is 0, my warp ID is 0, and my warp lane is 0
I am thread 1, my SM ID is 0, my warp ID is 0, and my warp lane is 1
I am thread 2, my SM ID is 0, my warp ID is 0, and my warp lane is 2
I am thread 3, my SM ID is 0, my warp ID is 0, and my warp lane is 3
I am thread 8, my SM ID is 3, my warp ID is 0, and my warp lane is 0
I am thread 9, my SM ID is 3, my warp ID is 0, and my warp lane is 1
I am thread 10, my SM ID is 3, my warp ID is 0, and my warp lane is 2
I am thread 11, my SM ID is 3, my warp ID is 0, and my warp lane is 3
I am thread 12, my SM ID is 4, my warp ID is 0, and my warp lane is 0
I am thread 13, my SM ID is 4, my warp ID is 0, and my warp lane is 1
I am thread 14, my SM ID is 4, my warp ID is 0, and my warp lane is 2
I am thread 15, my SM ID is 4, my warp ID is 0, and my warp lane is 3
I am thread 4, my SM ID is 1, my warp ID is 0, and my warp lane is 0
I am thread 5, my SM ID is 1, my warp ID is 0, and my warp lane is 1
I am thread 6, my SM ID is 1, my warp ID is 0, and my warp lane is 2
I am thread 7, my SM ID is 1, my warp ID is 0, and my warp lane is 3
$

Note that the output above will vary depending on the type of GPU you are running on. Don't expect your output to match it exactly.
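Since the result depends on the device, it can help to print how many multiprocessors your GPU reports and to copy the SM IDs back to the host instead of relying on device-side `printf`. The following is only a sketch under assumptions: it uses device 0, and `sm_id_from_ptx`, `record_block_sm` and `sm_ids_per_block` are hypothetical names made up for this example.

```cuda
#include <cstdio>
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

// Reads the PTX special register %smid (same inline-PTX wrapper idea as above).
static __device__ __forceinline__ uint32_t sm_id_from_ptx() {
  uint32_t smid;
  asm volatile("mov.u32 %0, %%smid;" : "=r"(smid));
  return smid;
}

// One thread per block records the SM that the block was on when the
// register was read.
__global__ void record_block_sm(uint32_t *sm_of_block) {
  if (threadIdx.x == 0) {
    sm_of_block[blockIdx.x] = sm_id_from_ptx();
  }
}

// Hypothetical host helper: returns one SM ID per block, or an empty vector
// if a CUDA call fails.
std::vector<uint32_t> sm_ids_per_block(int blocks, int threads_per_block) {
  std::vector<uint32_t> host(blocks, 0);
  uint32_t *dev = nullptr;
  if (cudaMalloc(&dev, blocks * sizeof(uint32_t)) != cudaSuccess) {
    return std::vector<uint32_t>();
  }
  record_block_sm<<<blocks, threads_per_block>>>(dev);
  // The blocking copy also surfaces errors from the kernel launch.
  cudaError_t err = cudaMemcpy(host.data(), dev, blocks * sizeof(uint32_t),
                               cudaMemcpyDeviceToHost);
  cudaFree(dev);
  if (err != cudaSuccess) {
    return std::vector<uint32_t>();
  }
  return host;
}

int main() {
  cudaDeviceProp prop;
  if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {  // assumes device 0
    fprintf(stderr, "could not query device 0\n");
    return 1;
  }
  printf("device 0 (%s) reports %d multiprocessors\n",
         prop.name, prop.multiProcessorCount);

  std::vector<uint32_t> ids = sm_ids_per_block(8, 32);
  for (size_t b = 0; b < ids.size(); ++b) {
    printf("block %u was on SM %u\n", (unsigned)b, (unsigned)ids[b]);
  }
  return 0;
}
```

On a small GPU such as the GeForce 610M you should expect several blocks to report the same SM ID. The assignment of blocks to SMs is up to the hardware scheduler, so it can differ from run to run.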