Modifying a CUDA sample results in unsafe access to global memory

Time: 2013-05-09 09:05:25

Tags: cuda

I modified the Sobel Filter sample to implement non-maximum suppression for a Canny filter. However, the following code generates an exception:

unsigned char pix00 = pCannyOriginal[ i-1 + (blockIdx.x-1) * blockDim.x];
unsigned char pix01 = pCannyOriginal[ i+0 + (blockIdx.x-1) * blockDim.x];
unsigned char pix02 = pCannyOriginal[ i+1 + (blockIdx.x-1) * blockDim.x];
unsigned char pix10 = pCannyOriginal[ i-1 + (blockIdx.x+0) * blockDim.x];
unsigned char pix11 = pCannyOriginal[ i+0 + (blockIdx.x+0) * blockDim.x];
unsigned char pix12 = pCannyOriginal[ i+1 + (blockIdx.x+0) * blockDim.x];
unsigned char pix20 = pCannyOriginal[ i-1 + (blockIdx.x+1) * blockDim.x];
unsigned char pix21 = pCannyOriginal[ i+0 + (blockIdx.x+1) * blockDim.x];
unsigned char pix22 = pCannyOriginal[ i+1 + (blockIdx.x+1) * blockDim.x];

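I understand that this results in invalid memory accesses, but the same set of assignments on the original texture does not generate one. So, does the tex2D function have a mechanism for handling invalid memory accesses? How can I fix this?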

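As a side note, using the original lena.pgm does not produce any exception, but replacing it with another image does. Does the original lena.pgm contain some extra rows and columns, or am I missing something here?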

1 Answer:

Answer 0: (score: 1)

The original code relies on a 2D texture:

    unsigned char pix00 = tex2D(tex, (float) i-1, (float) blockIdx.x-1);
    unsigned char pix01 = tex2D(tex, (float) i+0, (float) blockIdx.x-1);
    unsigned char pix02 = tex2D(tex, (float) i+1, (float) blockIdx.x-1);
    unsigned char pix10 = tex2D(tex, (float) i-1, (float) blockIdx.x+0);
    unsigned char pix11 = tex2D(tex, (float) i+0, (float) blockIdx.x+0);
    unsigned char pix12 = tex2D(tex, (float) i+1, (float) blockIdx.x+0);
    unsigned char pix20 = tex2D(tex, (float) i-1, (float) blockIdx.x+1);
    unsigned char pix21 = tex2D(tex, (float) i+0, (float) blockIdx.x+1);
    unsigned char pix22 = tex2D(tex, (float) i+1, (float) blockIdx.x+1);

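However, textures are not simple arrays: they support interpolation (see this post) and a number of other options, such as cudaAddressModeClamp, under which out-of-range accesses are clamped: a negative index is treated as 0, and an index that is too large is treated as the largest valid index (see this other post).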

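In your code, if you use the same (x, y) indices on a plain linearized array, you end up accessing invalid addresses (for example when x < 0 and/or y < 0) unless you take appropriate precautions.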
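
One possible precaution is to emulate the clamping yourself before indexing the array. The following is only a minimal sketch, not the sample's own code: `fetchClamped`, `clampInt`, and the `width`, `height`, and `pitch` parameters are hypothetical names introduced here, and it assumes a row-major image whose row stride in elements is `pitch`. The `HOST_DEVICE` macro lets the same helper compile with nvcc or with a plain C++ compiler.

```cpp
// Compiles as device code under nvcc, or as host-only code elsewhere.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Clamp v to the closed range [lo, hi].
HOST_DEVICE inline int clampInt(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

// Hypothetical helper: read pixel (x, y) from a row-major image stored in a
// linear array, clamping out-of-range coordinates to the nearest edge pixel.
// This mimics what cudaAddressModeClamp does for texture fetches.
// Assumption: `pitch` is the number of elements per row (>= width).
HOST_DEVICE inline unsigned char fetchClamped(const unsigned char *img,
                                              int x, int y,
                                              int width, int height, int pitch)
{
    x = clampInt(x, 0, width - 1);
    y = clampInt(y, 0, height - 1);
    return img[y * pitch + x];
}
```

Inside the kernel, each read would then look something like `fetchClamped(pCannyOriginal, i - 1, (int)blockIdx.x - 1, width, height, pitch)`. The cast to `int` matters because `blockIdx.x` is unsigned, so `blockIdx.x - 1` wraps around to a very large value in block 0 instead of becoming -1. The row stride is passed in explicitly rather than taken from `blockDim.x`, since the number of threads per block does not have to match the image width.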