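Given the CUDA code below, I am trying to perform bit-shift operations, and the results come back as zero. This should not happen. Does anyone know how to fix this? Am I missing a CUDA header include?
Code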
__device__ unsigned int FI( unsigned int in_data, unsigned int subkey,
        unsigned int *KLi1, unsigned int *KLi2, unsigned int *KOi1, unsigned int *KOi2,
        unsigned int *KOi3, unsigned int *KIi1, unsigned int *KIi2, unsigned int *KIi3) {
    unsigned int nine, seven;
    unsigned int S7[128] = {};
    unsigned int S9[512] = {};
    nine = (in_data>>7);
    seven = (in_data&0x7F);
    /* Now run the various operations */
    nine = (unsigned int)(S9[nine] ^ seven);
    seven = (unsigned int)(S7[seven] ^ (nine & 0x7F));
    seven ^= (subkey>>9);
    nine ^= (subkey&0x1FF);
    nine = (unsigned int)(S9[nine] ^ seven);
    seven = (unsigned int)(S7[seven] ^ (nine & 0x7F));
    in_data = (unsigned int)((seven<<9) + nine);
    return( in_data );
}
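One way to separate a real miscomputation from a debugger display problem is to work out on the host what FI() should return, then compare that with the value the kernel copies back. The sketch below mirrors the posted function body in plain C++ (which nvcc also accepts as host code). The names fi_reference and fi_logged_case are hypothetical, and passing the tables in as parameters is an assumption made here so that the routine does not depend on the table contents. It assumes 16-bit inputs and table entries that fit in 9 and 7 bits respectively.

```cpp
#include <cassert>

// Host-side mirror of the posted FI() body (hypothetical helper, not part of
// the original kasumiOp.cu). The tables are parameters rather than locals.
unsigned int fi_reference(unsigned int in_data, unsigned int subkey,
                          const unsigned int *S7,    // 128 entries, each < 128
                          const unsigned int *S9) {  // 512 entries, each < 512
    // Both inputs are treated as 16-bit values so every table index stays in range.
    assert(in_data < (1u << 16));
    assert(subkey  < (1u << 16));

    unsigned int nine  = in_data >> 7;    // upper 9 bits
    unsigned int seven = in_data & 0x7F;  // lower 7 bits

    nine  = S9[nine] ^ seven;
    seven = S7[seven] ^ (nine & 0x7F);

    seven ^= (subkey >> 9);
    nine  ^= (subkey & 0x1FF);

    nine  = S9[nine] ^ seven;
    seven = S7[seven] ^ (nine & 0x7F);

    return (seven << 9) + nine;
}

// Reproduces the logged call (in_data=461, subkey=0) with all-zero tables,
// matching the empty initializers shown in the posted code.
unsigned int fi_logged_case() {
    static const unsigned int S7[128] = {};
    static const unsigned int S9[512] = {};
    return fi_reference(461u, 0u, S7, S9);
}
```

With all-zero tables, the logged inputs give nine = 3 and seven = 77 after the split, and the function returns (77 << 9) + 77 = 39501. If the device version returns the same number, the zeros seen in cuda-gdb are more likely a matter of how the debugger reports the locals than of the shift itself.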
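Breakpoint analysis
This is an example from the code snippet above that shifts an unsigned int right by 7 positions. When I run my executable under cuda-gdb and break at that instruction, the shifted value stays at zero when it should not. When I perform the same operation directly at the cuda-gdb command prompt, I get a non-zero value. Any suggestions or hints?
Given the value of in_data, the variables nine and seven should both be non-zero.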
nine = (in_data>>7);
seven = (in_data&0x7F);
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (1,0,0), device 0, sm 0, warp 0, lane 1]
Breakpoint 1, FI (KLi1=0x3fffae0, KLi2=0x3fffb00, KOi1=0x3fffb20, KOi2=0x3fffb40, KOi3=0x3fffb60,
KIi1=0x3fffb80, KIi2=0x3fffba0, KIi3=0x3fffbc0, in_data=461, subkey=0) at kasumiOp.cu:61
61 nine = (in_data>>7);
(cuda-gdb) p in_data
$1 = 461
(cuda-gdb) step
62 seven = (in_data&0x7F);
(cuda-gdb) p nine
$2 = 0
(cuda-gdb) step
65 nine = (unsigned int)(S9[nine] ^ seven);
(cuda-gdb) p seven
$3 = 0
(cuda-gdb) p 461 >> 7
$4 = 3
(cuda-gdb) cuda thread
thread (1,0,0)
(cuda-gdb) p 561 & 0x7f
$5 = 49
(cuda-gdb) p 461 & 0x7f
$6 = 77
So in_data does hold a value. I will try a simple example and see whether I can reproduce the same behavior.
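A simple example along those lines could start on the host, where the debugger is out of the picture. The sketch below is plain C++ that nvcc also compiles as host code. The helper names upper_nine, lower_seven, recombine, and check_split are hypothetical and only mirror the first two statements of FI(); the input value 461 is the one shown in the session above.

```cpp
#include <cassert>

// Hypothetical helpers mirroring "nine = in_data >> 7" and
// "seven = in_data & 0x7F" from FI().
constexpr unsigned int upper_nine(unsigned int in_data)  { return in_data >> 7; }
constexpr unsigned int lower_seven(unsigned int in_data) { return in_data & 0x7F; }

// Putting the two halves back together must give the original value.
constexpr unsigned int recombine(unsigned int nine, unsigned int seven) {
    return (nine << 7) | seven;
}

// Compile-time checks for the value seen in the cuda-gdb session.
static_assert(upper_nine(461u) == 3u,   "461 >> 7 should be 3");
static_assert(lower_seven(461u) == 77u, "461 & 0x7F should be 77");
static_assert(recombine(3u, 77u) == 461u, "halves should recombine to 461");

// Runtime version of the round-trip check for arbitrary inputs.
void check_split(unsigned int in_data) {
    assert(recombine(upper_nine(in_data), lower_seven(in_data)) == in_data);
}
```

These agree with what the cuda-gdb prompt printed (3 and 77). A reasonable next step, offered as a suggestion rather than a diagnosis, is to have the kernel write nine and seven to global memory and copy them back: if the copied values are correct while `p nine` still shows 0, it is worth checking whether the build includes device debug information (nvcc -G), since inspecting locals without it can be unreliable.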
Answer 0 (score: 1)
Given the limited information provided (no code), I can only guess:
The GDB print command has been extended to decipher the location of any program variable and can be used to display the contents of any CUDA program variable including:
* data allocated via cudaMalloc()
* data that resides in various GPU memory regions, such as shared, local, and global memory
* special CUDA runtime variables, such as threadIdx
If in_data refers to a specific memory region, you may be working with a memory pointer rather than the actual data.
Just my two cents, though.