Host code (outside the kernel):
dim3 block(32, 32, 1);
printf("rows = %u\n", rows);
dim3 grid(8, 8, rows);
forward_step1<<<block, grid>>>(weight_D, a_D, res1_D, columns);
Kernel:
unsigned int tid = blockDim.x*threadIdx.y + threadIdx.x;
unsigned int i = blockIdx.z;
unsigned int j = (gridDim.x*blockIdx.y+blockIdx.x)*blockDim.x*blockDim.y + tid;
if (j==0) printf("%u\n", i);
Result:
rows = 3
0
0
0
Answer 0 (score: 2)
The syntax for a kernel launch is:
kernel<<<grid_size, block_size>>>(arguments)
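You appear to have swapped the grid_size and block_size arguments. As written, your grid size is (32, 32, 1) and your block size is (8, 8, rows). Because the grid has only one layer in z, blockIdx.z is always 0. The three zeros come from the three threads of block (0, 0) that have threadIdx.x == 0 and threadIdx.y == 0 (one for each threadIdx.z), since tid ignores threadIdx.z and so j == 0 for all of them.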
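As a minimal sketch of the fix, the program below launches with the grid first and the block second. The kernel name print_rows is a hypothetical stand-in for forward_step1 (its arguments are omitted because the question does not show how they are used), and rows is hard-coded to 3 to match the output in the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for forward_step1: same index arithmetic as in the
// question, with the data arguments left out.
__global__ void print_rows()
{
    unsigned int tid = blockDim.x * threadIdx.y + threadIdx.x;
    unsigned int i = blockIdx.z;  // row index, one grid layer per row
    unsigned int j = (gridDim.x * blockIdx.y + blockIdx.x) * blockDim.x * blockDim.y + tid;
    if (j == 0) printf("%u\n", i);
}

int main()
{
    unsigned int rows = 3;  // assumed value, taken from the question's output
    printf("rows = %u\n", rows);

    dim3 block(32, 32, 1);  // 1024 threads per block
    dim3 grid(8, 8, rows);  // rows layers of 8 x 8 blocks

    // Grid dimensions come first, block dimensions second.
    print_rows<<<grid, block>>>();

    // Report launch-time and execution-time errors.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

With this launch configuration, blockIdx.z ranges over 0 to rows - 1, so each row index should be printed once. The order of the lines is not guaranteed, because blocks may be scheduled in any order.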