Shift operators in CUDA

Time: 2014-01-23 19:09:57

Tags: cuda

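In the CUDA code below I am trying to perform bit-shift operations, but the results come back as zero. That should not happen. Does anyone know how to fix this? Am I missing a CUDA header include?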

Code

    __device__ unsigned int FI(unsigned int in_data, unsigned int subkey,
        unsigned int *KLi1, unsigned int *KLi2, unsigned int *KOi1, unsigned int *KOi2,
        unsigned int *KOi3, unsigned int *KIi1, unsigned int *KIi2, unsigned int *KIi3) {

        unsigned int nine, seven;

        unsigned int S7[128] = {};  /* zero-initialized as posted */

        unsigned int S9[512] = {};  /* zero-initialized as posted */

        /* Split the input: upper bits go to nine, lower 7 bits to seven */
        nine = (in_data >> 7);
        seven = (in_data & 0x7F);

        /* Now run the various operations */
        nine = (unsigned int)(S9[nine] ^ seven);
        seven = (unsigned int)(S7[seven] ^ (nine & 0x7F));
        seven ^= (subkey >> 9);
        nine ^= (subkey & 0x1FF);
        nine = (unsigned int)(S9[nine] ^ seven);
        seven = (unsigned int)(S7[seven] ^ (nine & 0x7F));
        in_data = (unsigned int)((seven << 9) + nine);
        return in_data;
    }

Breakpoint analysis

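Here is a snippet that shifts an unsigned int right by 7 positions. When I run my executable under cuda-gdb and break at that instruction, the shifted value stays zero when it should not. When I evaluate the same expression directly at the cuda-gdb prompt, I get a non-zero value. Any suggestions or hints?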

Given the value of in_data, the variables nine and seven should both be non-zero.

    nine = (in_data>>7);
    seven = (in_data&0x7F);

    [Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (1,0,0), device 0, sm 0, warp 0, lane 1]
    Breakpoint 1, FI (KLi1=0x3fffae0, KLi2=0x3fffb00, KOi1=0x3fffb20, KOi2=0x3fffb40, KOi3=0x3fffb60, 
    KIi1=0x3fffb80, KIi2=0x3fffba0, KIi3=0x3fffbc0, in_data=461, subkey=0) at kasumiOp.cu:61
     61   nine = (in_data>>7);
     (cuda-gdb) p in_data
     $1 = 461
     (cuda-gdb) step
     62   seven = (in_data&0x7F);
     (cuda-gdb) p nine
     $2 = 0
     (cuda-gdb) step
     65   nine = (unsigned int)(S9[nine] ^ seven);
     (cuda-gdb) p seven
     $3 = 0
     (cuda-gdb) p 461 >> 7
     $4 = 3
     (cuda-gdb) cuda thread
     thread (1,0,0)
     (cuda-gdb) p 561 & 0x7f
     $5 = 49
     (cuda-gdb) p 461 & 0x7f
     $6 = 77

So in_data does hold a value. I will try a simple example and see whether I can reproduce the same behavior.
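
One possible starting point for such a simple example is a host-side check of the two expressions, sketched here in plain C++ so that it can be compiled without a GPU. The names FiHalves and split_fi_input are hypothetical and do not appear in the original code; the sketch only confirms the expected arithmetic for in_data = 461 and says nothing about why cuda-gdb printed zero.

```cpp
#include <cstdint>

// Hypothetical helper type (not from the original post): the two halves
// that FI extracts from its input before running the table lookups.
struct FiHalves {
    std::uint32_t nine;   // in_data >> 7
    std::uint32_t seven;  // in_data & 0x7F
};

// Same two expressions as in the posted snippet, evaluated on the host.
constexpr FiHalves split_fi_input(std::uint32_t in_data) {
    return FiHalves{in_data >> 7, in_data & 0x7Fu};
}

// Same input as in the cuda-gdb session: in_data = 461.
static_assert(split_fi_input(461).nine == 3, "461 >> 7 should be 3");
static_assert(split_fi_input(461).seven == 77, "461 & 0x7F should be 77");
```

If this compiles, the expected values match what the cuda-gdb prompt reported for `p 461 >> 7` and `p 461 & 0x7f`, so any difference seen inside the kernel would have to come from somewhere other than the shift and mask expressions themselves.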

1 answer:

Answer 0 (score: 1)

Given the limited information provided (no code), I can only guess:

The CUDA-GDB documentation states:

The GDB print command has been extended to decipher the location of any program variable and can be used to display the contents of any CUDA program variable including:
* data allocated via cudaMalloc()
* data that resides in various GPU memory regions, such as shared, local, and global memory
* special CUDA runtime variables, such as threadIdx

If in_data refers to a specific memory region, you may be working with a memory pointer rather than with the actual data.
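
To make that distinction concrete, here is a small sketch in plain C++. Both function names are made up for illustration and neither comes from the question. Note that in the posted FI signature in_data is declared as a plain unsigned int, so there it is passed by value.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; neither appears in the post.

// Receives the value itself, as FI does with in_data.
inline std::uint32_t upper_bits_by_value(std::uint32_t in_data) {
    return in_data >> 7;
}

// Receives a pointer: the data has to be dereferenced before shifting.
// Shifting the pointer itself would operate on an address, not on the data.
inline std::uint32_t upper_bits_by_pointer(const std::uint32_t* in_data) {
    return *in_data >> 7;
}
```

Called with 461 and with the address of a variable holding 461 respectively, both helpers should return the same result, which is the behavior one would expect once the data rather than its address is being shifted.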

Just my two cents, though.