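I am using a global 2D array variable in CUDA, and I am trying to do a cumulative addition on it. But when I re-run the code, it starts from the value left by the previous run. For example, if the value was 50 in the last run, the next run shows 100. The value is not reset to 0.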
#include <cstdio>

__device__ double *d_t;
__device__ size_t d_gridPitch;

__global__ void kernelFunc()
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Step to row i of the pitched allocation (the pitch is in bytes).
    double* rowt = (double*)((char *)d_t + i * d_gridPitch);
    rowt[0] = rowt[0] + 50000;
    printf("%.0f, ", rowt[0]);
}
int main()
{
    int size = 16;
    size_t d_pitchLoc;
    double *d_tLoc;
    cudaMallocPitch((void**)&d_tLoc, &d_pitchLoc, size * sizeof(double), size);
    // Copy the whole size_t pitch, not just sizeof(int) bytes of it.
    cudaMemcpyToSymbol(d_gridPitch, &d_pitchLoc, sizeof(d_pitchLoc));
    cudaMemcpyToSymbol(d_t, &d_tLoc, sizeof(d_tLoc));
    kernelFunc<<<1,size>>>();
    cudaDeviceReset();
    return 0;
}
Answer 0 (score: 1)
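It sounds like what you want is to initialize the memory you are allocating (note that this has nothing to do with "resetting a variable"). To do that, use cudaMemset2D to initialize the bytes in the memory allocation returned by cudaMallocPitch. So your host API sequence would look like this: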
int size = 16;
size_t d_pitchLoc;
double *d_tLoc;
cudaMallocPitch((void**)&d_tLoc, &d_pitchLoc, size * sizeof(double), size);
cudaMemset2D(d_tLoc, d_pitchLoc, 0, size * sizeof(double), size);
cudaMemcpyToSymbol(d_gridPitch, &d_pitchLoc, sizeof(d_pitchLoc));
cudaMemcpyToSymbol(d_t, &d_tLoc, sizeof(d_tLoc));
Note that cudaMemset2D, like cudaMemset, sets bytes: every byte in the region is filled with the least significant byte of the int value supplied.
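To make the byte-wise behavior concrete, here is a minimal host-side sketch. It uses std::memset as a stand-in for cudaMemset/cudaMemset2D, on the assumption that both fill memory one byte at a time as described above. The helper names fillBytewise and bitsOf are hypothetical and not part of any CUDA API, and the sketch assumes 64-bit IEEE 754 doubles.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: fills `count` doubles the way a byte-wise memset does,
// i.e. every byte of the buffer is set to the low 8 bits of `value`.
std::vector<double> fillBytewise(std::size_t count, int value)
{
    std::vector<double> buf(count);
    std::memset(buf.data(), value, count * sizeof(double));
    return buf;
}

// Hypothetical helper: returns the raw 64-bit pattern stored in a double.
std::uint64_t bitsOf(double d)
{
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof(bits));
    return bits;
}
```

With a fill value of 0, every element comes out as 0.0, which is why the cudaMemset2D call above is enough to zero the array. With a fill value of 1, each element holds the bit pattern 0x0101010101010101 rather than 1.0, so a non-zero starting value would need a small initialization kernel or a cudaMemcpy2D from an initialized host buffer instead.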