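In theory, the sizes of types on a CUDA device could differ from their sizes on the host platform. So what is the idiomatic way to express "sizeof(T) on my CUDA device" in code, other than rolling your own map from type ids to values you happen to know?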
Answer 0 (score: 2)
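Nothing you are asking about is necessary on any currently supported CUDA platform. One reason the CUDA toolchain is so tightly integrated with the host compiler and the host C++ runtime library is to guarantee that the sizes of fundamental types always match between host and device. No idiomatic size translation is needed: the result of sizeof is always the same for host and device. Note that the sizes of fundamental types can differ from platform to platform (Windows is an LLP64 / IL32P64 platform, while Linux and OS X are LP64 / I32LP64 platforms), but the GPU simply follows whatever the host uses.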
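Also note that the GPU can impose alignment requirements on composite types, which may mean that the compiled size of such a type differs from what you expect. The conditions under which this applies are discussed in detail in the documentation.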
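As a rough illustration of the alignment point, the sketch below compares a plain three-float struct with one that requests 16-byte alignment. The names `Vec3`, `Vec3Padded` and `padding_bytes` are hypothetical, and standard `alignas` is used so that the snippet also builds with a plain C++11 compiler. CUDA's own `__align__` specifier and its built-in vector types such as `float4` behave analogously.

```cpp
#include <cstddef>

// HD expands to CUDA's qualifiers under nvcc and to nothing under a plain
// C++ compiler, so the same file builds either way.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };                    // natural alignment of float
struct alignas(16) Vec3Padded { float x, y, z; };  // 16-byte alignment requested

// Assumes a 4-byte float, which holds on the platforms shown in this answer.
static_assert(sizeof(Vec3) == 3 * sizeof(float), "Vec3 should have no padding");
static_assert(alignof(Vec3Padded) == 16, "alignment request not honoured");
static_assert(sizeof(Vec3Padded) == 16, "size is rounded up to the alignment");

// Bytes of tail padding added to each element by the alignment request.
HD constexpr std::size_t padding_bytes()
{
    return sizeof(Vec3Padded) - sizeof(Vec3);
}
```

The same three floats occupy 12 bytes in one type and 16 in the other, so buffer sizes should be computed from `sizeof` of the actual type rather than from a hand-counted sum of its members.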
To see that the sizes match, consider the following simple example code:
#include <cstdio>

// Compiled for both host and device, so the same sizeof expressions are
// evaluated on each side.
__device__ __host__ __noinline__ void printsizes(const char* title)
{
    printf("%s\n", title);
    // sizeof yields size_t; cast to unsigned long and print with %lu
    printf("sizeof(void*) = %lu\n", (unsigned long)sizeof(void*));
    printf("sizeof(char) = %lu\n", (unsigned long)sizeof(char));
    printf("sizeof(bool) = %lu\n", (unsigned long)sizeof(bool));
    printf("sizeof(short) = %lu\n", (unsigned long)sizeof(short));
    printf("sizeof(int) = %lu\n", (unsigned long)sizeof(int));
    printf("sizeof(long) = %lu\n", (unsigned long)sizeof(long));
    printf("sizeof(long long) = %lu\n", (unsigned long)sizeof(long long));
}

__global__ void printkernel()
{
    printsizes("On the device:");
}

int main()
{
    printsizes("On the host:");
    printkernel<<<1,1>>>();
    cudaDeviceSynchronize();  // wait for the kernel so its printf output is flushed
    cudaDeviceReset();
    return 0;
}
Compiling and running it on a 64-bit Linux platform produces:
$ nvcc -arch=sm_52 -m64 -o sizeof64 sizeof.cu
$ ./sizeof64
On the host:
sizeof(void*) = 8
sizeof(char) = 1
sizeof(bool) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 8
sizeof(long long) = 8
On the device:
sizeof(void*) = 8
sizeof(char) = 1
sizeof(bool) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 8
sizeof(long long) = 8
Building it on a 64-bit Windows platform:
>nvcc -arch=sm_21 -m64 sizes.cu
sizes.cu
Creating library a.lib and object a.exp
>a.exe
On the host:
sizeof(void*) = 8
sizeof(char) = 1
sizeof(bool) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 4
sizeof(long long) = 8
On the device:
sizeof(void*) = 8
sizeof(char) = 1
sizeof(bool) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 4
sizeof(long long) = 8
Building it on a 32-bit Windows platform:
>nvcc -arch=sm_21 -m32 sizes.cu
sizes.cu
Creating library a.lib and object a.exp
C:\Users\david\Documents>a.exe
On the host:
sizeof(void*) = 4
sizeof(char) = 1
sizeof(bool) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 4
sizeof(long long) = 8
On the device:
sizeof(void*) = 4
sizeof(char) = 1
sizeof(bool) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 4
sizeof(long long) = 8
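Note that the sizes of void* and long vary from platform to platform. In every case, however, the GPU sizes match the host sizes. This is a fundamental design principle of the CUDA driver and the GPU runtime.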
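Host and device always agree within one build, but two builds for different platforms may not agree with each other, as the `long` rows above show. If a structure has to keep the same layout across platforms, one option is to use fixed-width types and check the layout at compile time. This is a minimal sketch under that assumption: `Particle` and `particle_bytes` are hypothetical names, not part of CUDA.

```cpp
#include <cstddef>
#include <cstdint>

// HD expands to CUDA's qualifiers under nvcc and to nothing under a plain
// C++ compiler, so the same file builds either way.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical record shared by host and device code.
struct Particle {
    std::int64_t id;    // 8 bytes on LP64 and LLP64 alike, unlike long
    std::int32_t cell;  // 4 bytes
    float        mass;  // assumed to be a 4-byte IEEE float
};

// Fail the build, rather than misbehave at run time, if the layout changes.
static_assert(sizeof(Particle) == 16, "unexpected padding in Particle");

// Bytes needed for `count` particles, usable on either side.
HD constexpr std::size_t particle_bytes(std::size_t count)
{
    return count * sizeof(Particle);
}
```

A host call such as `cudaMalloc(&ptr, particle_bytes(n))` would then reserve the same number of bytes per element whether the program was built on Windows or on Linux.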