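I'm developing some simple programs in CUDA, and I'd like to know which thread is executing on which core of the GPU. I'm using Visual Studio 2012 and have an NVIDIA GeForce 610M graphics card.
Is it possible to do this? I've searched Google extensively, but in vain.
Edit:
I know this is a really strange request, but my college project guide asked me to do it.
Answer 0 (score: 6)
Combining information from the PTX manual with a simple inline-PTX wrapper, the following functions should give you what you need: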
static __device__ __inline__ uint32_t __mysmid(){
  uint32_t smid;
  asm volatile("mov.u32 %0, %%smid;" : "=r"(smid));
  return smid;
}
The function above tells you which multiprocessor (SM) the calling thread is executing on.
static __device__ __inline__ uint32_t __mywarpid(){
  uint32_t warpid;
  asm volatile("mov.u32 %0, %%warpid;" : "=r"(warpid));
  return warpid;
}
The function above tells you which warp the calling thread belongs to.
static __device__ __inline__ uint32_t __mylaneid(){
  uint32_t laneid;
  asm volatile("mov.u32 %0, %%laneid;" : "=r"(laneid));
  return laneid;
}
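The function above tells you which warp lane the calling thread occupies.
Note that in the case of dynamic parallelism (and possibly other scenarios, such as debugging), this information is volatile and may change during program execution.
For definitions of terms such as multiprocessor and warp, refer to the programming guide.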
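To make the warp and lane terms a little more concrete, here is a minimal sketch that prints the lane index computed from threadIdx.x next to the value read from the %laneid register. It assumes a one-dimensional block; the names lane_from_ptx and compare_lane_kernel are made up for this illustration. The warp index computed here is the thread's position within its block, which is a different quantity from the hardware %warpid value, so the sketch does not compare those two.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical helper: reads the %laneid special register via inline PTX,
// using the same pattern as the wrappers in the answer.
static __device__ __forceinline__ uint32_t lane_from_ptx() {
    uint32_t laneid;
    asm volatile("mov.u32 %0, %%laneid;" : "=r"(laneid));
    return laneid;
}

// Hypothetical kernel: assumes a 1D block, so threadIdx.x alone identifies
// the thread within the block.
__global__ void compare_lane_kernel() {
    unsigned warp_in_block = threadIdx.x / warpSize;  // warp index within this block
    unsigned computed_lane = threadIdx.x % warpSize;  // lane index within that warp
    printf("thread %u: warp-in-block %u, computed lane %u, laneid register %u\n",
           threadIdx.x, warp_in_block, computed_lane, lane_from_ptx());
}

int main() {
    // One block of 64 threads, i.e. two warps on hardware with a warp size of 32.
    compare_lane_kernel<<<1, 64>>>();
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

On the GPUs I am aware of, the two lane columns agree for a 1D block, but treat that as an observation to check on your own hardware rather than a guarantee.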
Here is a complete example:
$ cat t646.cu
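#include <stdio.h>
#include <stdint.h>

// ID of the multiprocessor (SM) the calling thread is running on
static __device__ __inline__ uint32_t __mysmid(){
  uint32_t smid;
  asm volatile("mov.u32 %0, %%smid;" : "=r"(smid));
  return smid;
}

// ID of the warp the calling thread belongs to
static __device__ __inline__ uint32_t __mywarpid(){
  uint32_t warpid;
  asm volatile("mov.u32 %0, %%warpid;" : "=r"(warpid));
  return warpid;
}

// Lane of the calling thread within its warp
static __device__ __inline__ uint32_t __mylaneid(){
  uint32_t laneid;
  asm volatile("mov.u32 %0, %%laneid;" : "=r"(laneid));
  return laneid;
}

__global__ void mykernel(){
  // Global thread index across the whole grid
  int idx = threadIdx.x+blockDim.x*blockIdx.x;
  // %u matches the unsigned 32-bit values returned by the helpers
  printf("I am thread %d, my SM ID is %u, my warp ID is %u, and my warp lane is %u\n", idx, __mysmid(), __mywarpid(), __mylaneid());
}

int main(){
  // 4 blocks of 4 threads each
  mykernel<<<4,4>>>();
  // Wait for the kernel to finish so its printf output is flushed
  cudaDeviceSynchronize();
  return 0;
}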
$ nvcc -arch=sm_20 -o t646 t646.cu
$ ./t646
I am thread 0, my SM ID is 0, my warp ID is 0, and my warp lane is 0
I am thread 1, my SM ID is 0, my warp ID is 0, and my warp lane is 1
I am thread 2, my SM ID is 0, my warp ID is 0, and my warp lane is 2
I am thread 3, my SM ID is 0, my warp ID is 0, and my warp lane is 3
I am thread 8, my SM ID is 3, my warp ID is 0, and my warp lane is 0
I am thread 9, my SM ID is 3, my warp ID is 0, and my warp lane is 1
I am thread 10, my SM ID is 3, my warp ID is 0, and my warp lane is 2
I am thread 11, my SM ID is 3, my warp ID is 0, and my warp lane is 3
I am thread 12, my SM ID is 4, my warp ID is 0, and my warp lane is 0
I am thread 13, my SM ID is 4, my warp ID is 0, and my warp lane is 1
I am thread 14, my SM ID is 4, my warp ID is 0, and my warp lane is 2
I am thread 15, my SM ID is 4, my warp ID is 0, and my warp lane is 3
I am thread 4, my SM ID is 1, my warp ID is 0, and my warp lane is 0
I am thread 5, my SM ID is 1, my warp ID is 0, and my warp lane is 1
I am thread 6, my SM ID is 1, my warp ID is 0, and my warp lane is 2
I am thread 7, my SM ID is 1, my warp ID is 0, and my warp lane is 3
$
Note that the output above will vary depending on the type of GPU you run on. Don't expect your output to match it exactly.
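One way to see why the output depends on the GPU is to ask the runtime how many multiprocessors the device has and how wide its warps are. The following is a minimal sketch using cudaGetDeviceProperties; querying device 0 is an assumption, so adjust the index if your system has more than one GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int device = 0;  // assumption: inspect the first CUDA device
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Device %d: %s\n", device, prop.name);
    printf("  compute capability:    %d.%d\n", prop.major, prop.minor);
    printf("  multiprocessors (SMs): %d\n", prop.multiProcessorCount);
    printf("  warp size:             %d\n", prop.warpSize);
    printf("  max threads per SM:    %d\n", prop.maxThreadsPerMultiProcessor);
    return 0;
}
```

The SM IDs a kernel reports need not be a contiguous run starting at zero (the sample output above shows 0, 1, 3 and 4), so the SM count printed here is best read as a rough guide to how many distinct IDs you might see, not as an exact range.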