Is restrict(amp) more restrictive than CUDA kernel code?

Asked: 2012-03-12 20:05:45

Tags: parallel-processing cuda gpu-programming c++-amp

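In C++ AMP, kernel functions or lambdas are marked with restrict(amp), which imposes severe restrictions on the allowed subset of C++ (listed here). Does CUDA allow more freedom in the subset of C or C++ that can be used in kernel functions?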

1 answer:

Answer 0 (score: 18)

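As of Visual Studio 11 and CUDA 4.1, restrict(amp) functions are more restrictive than CUDA's analogous __device__ functions. Most noticeably, AMP places more restrictions on how pointers can be used. This is a natural consequence of AMP's DirectX 11 compute substrate, which disallows pointers in HLSL (graphics shader) code. By contrast, CUDA's lower-level IR is PTX, which is more general than HLSL.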
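To make the pointer difference concrete, here is a minimal sketch of a function that uses a pointer stored in a structure and a pointer to a pointer. Both appear as disallowed for restrict(amp) in the comparison in this answer, while CUDA device code accepts them. The names `Span`, `sum_rows`, and the `HOST_DEVICE` macro are hypothetical, introduced only for illustration. The macro is an assumption that lets the same source build with a plain C++ compiler for host-side testing.

```cpp
#include <cstddef>

// Under nvcc the function is compiled for both host and device;
// with a plain C++ compiler the qualifier expands to nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical type: a structure holding a raw pointer.
// Fine in CUDA device code; "No pointers in structures" applies to restrict(amp).
struct Span {
    const float* data;
    std::size_t  size;
};

// Takes a pointer to pointers (multiple indirection), which restrict(amp)
// also rejects. Sums every element of every row.
HOST_DEVICE inline float sum_rows(const Span* const* rows, std::size_t row_count) {
    float total = 0.0f;
    for (std::size_t r = 0; r < row_count; ++r) {
        const Span* row = rows[r];
        for (std::size_t i = 0; i < row->size; ++i) {
            total += row->data[i];
        }
    }
    return total;
}
```

In a real CUDA program the `Span` objects and the buffers they point to would have to live in device-accessible memory before a kernel could call `sum_rows`; the sketch only shows that the language constructs themselves are accepted.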

Here is a line-by-line comparison:

| VS 11 AMP restrict(amp) functions     | CUDA 4.1 sm_2x __device__ functions  |
|------------------------------------------------------------------------------|
|* can only call functions that have    |* can only call functions that have   |
|  the restrict(amp) clause             |  the __device__ decoration           |
|* The function must be inlinable       |* need not be inlined                 |
|* The function can declare only        |* Class types are allowed             |
|  POD variables                        |                                      |
|* Lambda functions cannot              |* Lambdas are not supported, but      |
|  capture by reference and             |  user functors can hold pointers     |
|  cannot capture pointers              |                                      |
|* References and single-indirection    |* References and multiple-indirection |
|  pointers are supported only as local |  pointers are supported              |
|  variables and function arguments     |                                      |
|* No recursion                         |* Recursion OK                        |
|* No volatile variables                |* Volatile variables OK               |
|* No virtual functions                 |* Virtual functions OK                |
|* No pointers to functions             |* Pointers to functions OK            |
|* No pointers to member functions      |* Pointers to member functions OK     |
|* No pointers in structures            |* Pointers in structures OK           |
|* No pointers to pointers              |* Pointers to pointers OK             |
|* No goto statements                   |* goto statements OK                  |
|* No labeled statements                |* Labeled statements OK               |
|* No try, catch, or throw statements   |* No try, catch, or throw statements  |
|* No global variables                  |* Global __device__ variables OK      |
|* Static variables through tile_static |* Static variables through __shared__ |
|* No dynamic_cast                      |* No dynamic_cast                     |
|* No typeid operator                   |* No typeid operator                  |
|* No asm declarations                  |* asm declarations (inline PTX) OK    |
|* No varargs                           |* No varargs                          |
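
As a rough illustration of a few rows from the table, the sketch below uses recursion, a pointer to a function, and a virtual function. The table lists all three as OK for CUDA 4.1 sm_2x __device__ functions and as disallowed under restrict(amp). Every name here (`factorial`, `apply_twice`, `Shape`, `Square`, `HOST_DEVICE`) is hypothetical, and the macro is an assumption that lets the code also build as ordinary host C++.

```cpp
// Under nvcc the functions are compiled for both host and device;
// with a plain C++ compiler the qualifier expands to nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Recursion: "Recursion OK" for CUDA, "No recursion" for restrict(amp).
HOST_DEVICE inline unsigned factorial(unsigned n) {
    return n <= 1u ? 1u : n * factorial(n - 1u);
}

// Pointer to function: allowed in CUDA device code, not under restrict(amp).
typedef unsigned (*UnaryFn)(unsigned);

HOST_DEVICE inline unsigned apply_twice(UnaryFn f, unsigned x) {
    return f(f(x));
}

// Virtual function: allowed in CUDA device code, not under restrict(amp).
struct Shape {
    HOST_DEVICE virtual ~Shape() {}
    HOST_DEVICE virtual float area() const = 0;
};

struct Square : Shape {
    float side;
    HOST_DEVICE explicit Square(float s) : side(s) {}
    HOST_DEVICE virtual float area() const { return side * side; }
};
```

Two caveats apply when this runs on a GPU rather than on the host: a function pointer used in device code must hold a device address (for example, taken inside device code), and an object with virtual functions should be constructed on the device so that its virtual table pointer is valid there.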

You can read more about the restrictions of restrict(amp) here. You can read about C++ support in CUDA __device__ functions in Appendix D of the CUDA C Programming Guide.