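Is there a way to find the required compute capability from a binary that uses CUDA? I know the application was built for a specific graphics card (one with compute capability 2.1).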
Answer 0 (score: 3)
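Running cuobjdump on the executable can help. It tells you which ptx (code that is JIT-compiled at run time) is embedded in the compiled file, and which sass (the actual machine code executed on a specific device) has been precompiled into it. Here is sample output for device code compiled with -arch=sm_20: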
$ cuobjdump quick
Fatbin elf code:
================
arch = sm_20
code version = [1,7]
producer = <unknown>
host = linux
compile_size = 64bit
identifier = quick.cu
Fatbin elf code:
================
arch = sm_20
code version = [1,7]
producer = cuda
host = linux
compile_size = 64bit
identifier = quick.cu
Fatbin ptx code:
================
arch = sm_20
code version = [4,1]
producer = cuda
host = linux
compile_size = 64bit
compressed
identifier = quick.cu
ptxasOptions = --generate-line-info
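
If you need to check many binaries, the listing above is regular enough to scan with a script. What follows is only a minimal sketch, assuming the output keeps the "Fatbin <kind> code:" / "arch = sm_XY" layout shown here. The function names (parse_fatbin_archs, min_compute_capability, dump_archs) are my own and not part of any CUDA tool, and other toolkit versions may print the listing differently.

```python
import re
import subprocess


def parse_fatbin_archs(text):
    """Return (kind, arch) pairs such as ("elf", "sm_20") from cuobjdump output."""
    results = []
    kind = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"Fatbin (\w+) code:", line)
        if header:
            # Start of a new embedded entry: remember whether it is elf or ptx.
            kind = header.group(1)
            continue
        arch = re.match(r"arch\s*=\s*(\S+)", line)
        if arch and kind is not None:
            results.append((kind, arch.group(1)))
            kind = None  # take only the first arch line of each entry
    return results


def min_compute_capability(pairs):
    """Lowest sm_XY among the entries as (major, minor), or None if none parse."""
    caps = []
    for _kind, arch in pairs:
        m = re.match(r"sm_(\d+)(\d)", arch)  # last digit is the minor version
        if m:
            caps.append((int(m.group(1)), int(m.group(2))))
    return min(caps) if caps else None


def dump_archs(path):
    """Run cuobjdump on a binary (assumes cuobjdump is on PATH) and parse it."""
    out = subprocess.run(["cuobjdump", path], capture_output=True,
                         text=True, check=True).stdout
    return parse_fatbin_archs(out)
```

For the quick binary above, dump_archs("quick") should report two elf entries and one ptx entry, all for sm_20. An elf (sass) entry for sm_20 runs on devices of the same major compute capability, and the ptx entry can be JIT-compiled for later devices, so a compute capability 2.1 card is covered.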