Question

Example:

#include <cuda.h>
#include <stdint.h>
#include <assert.h>

__constant__ int32_t m;

int main(int argc, char* argv[])
{
    void* s;
    int r = cudaGetSymbolAddress( &s, m);
    assert( r == cudaSuccess );
    return 0;
}

Compile:

$ nvcc test.cu -o test -arch compute_20 -code sm_20

Run:

$ ./test

Output:

test: test.cu:15: int main(int, char**): Assertion `r == cudaSuccess' failed.
Aborted (core dumped)

(In case it makes any difference: I tested this on two different computers with two different cards. Both were running CUDA 6.)

What is going wrong here?

Answer 1

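As @sgar91 pointed out, the problem is that the compilation target does not match the actual GPU.

To be specific: your options include -code sm_20, which makes the compiler build a binary for sm_20 only, with no PTX embedded in it. That means the code cannot be JIT-compiled for your device (compute capability > 2.0), so your GPU operations fail. You should use -code compute_20, or one or more -gencode arguments (see the nvcc manual for more examples).

Some examples: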

$ nvcc test.cu -o test -arch compute_20 -code compute_20
$ nvcc test.cu -o test -gencode="arch=compute_20,code=\"compute_20,sm_20,sm_30\""
$ nvcc test.cu -o test -gencode="arch=compute_20,code=\"sm_20,sm_21\"" -gencode="arch=compute_30,code=\"compute_30,sm_30\""
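To choose matching values for these flags, it helps to know the compute capability of the installed device. The following is a minimal sketch, not part of the original answer, that queries it through the runtime API. The helper `sm_name` is a hypothetical name introduced here for illustration; it only formats the capability as an nvcc target name.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical helper (not from the original post): format a compute
// capability as an nvcc target name, e.g. (3, 0) becomes "sm_30".
static void sm_name(int major, int minor, char* buf, size_t n)
{
    snprintf(buf, n, "sm_%d%d", major, minor);
}

int main(void)
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties(%d) failed: %s\n",
                    dev, cudaGetErrorString(err));
            return 1;
        }
        char target[16];
        sm_name(prop.major, prop.minor, target, sizeof target);
        // Print the device name, its capability, and the matching target.
        printf("device %d: %s, compute capability %d.%d (%s)\n",
               dev, prop.name, prop.major, prop.minor, target);
    }
    return 0;
}
```

If the reported capability is higher than every sm_XX listed in -code and no compute_XX (PTX) target is included, the binary contains nothing the device can run.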

You should report the actual error rather than just asserting on your CUDA API calls, since that would have helped in diagnosing this.
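As an illustration of that advice, here is a minimal sketch of the question's program that prints the error string instead of asserting. The `CHECK_CUDA` macro is a hypothetical helper introduced here, not something from the original post or the CUDA toolkit; it relies only on `cudaGetErrorString` from the runtime API.

```cuda
#include <cuda_runtime.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

__constant__ int32_t m;

// Hypothetical helper (not from the original post): if a runtime API call
// fails, print the file, line, call text, and CUDA error string, then exit.
#define CHECK_CUDA(call)                                          \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s:%d: %s failed: %s\n",             \
                    __FILE__, __LINE__, #call,                    \
                    cudaGetErrorString(err_));                    \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

int main(void)
{
    void* s = NULL;
    // Same call as in the question, but a failure now names its cause.
    CHECK_CUDA(cudaGetSymbolAddress(&s, m));
    printf("address of m: %p\n", s);
    return 0;
}
```

With a mismatched build, this prints the runtime's description of the failure rather than a bare assertion message, which makes an architecture mismatch much easier to spot.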

Cannot use constants in CUDA
