Mismatch between actual output and expected output

Time: 2019-05-27 08:21:05

Tags: cuda offset pycuda

Hi everyone,

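This question is related to "pycuda memory offset not in sequence". To recap, I want to use PyCUDA to generate an MxN image, where most of the time M is not equal to N, for example a 160x142 image. This time, however, my expected output does not match the actual output.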

I tried setting the output to a constant with image[offset] = 10.0, and that works as expected. The problem appears when I assign a looked-up value instead, with image[offset] = image_x[x].

Below is the code I am using.

import matplotlib.pyplot as plt
import pycuda.autoinit
import pycuda.driver as driver
from pycuda import gpuarray
from pycuda.compiler import SourceModule
import numpy as np

AREA_WIDTH = 60.0
grid_size = 0.5
BLOCK_SIZE = 32

ker = SourceModule("""
__global__ void image_ker(float *image, float *image_x, float *image_y)
{
    unsigned int x = threadIdx.x + blockIdx.x * blockDim.x;
    unsigned int y = threadIdx.y + blockIdx.y * blockDim.y;
    unsigned int offset = x + (y * blockDim.x * gridDim.x);
    float x_value = image_x[x];
    __syncthreads();

    if ((x < 160) && (y < 142))
    { 
        image[offset] = x_value;
        image_x[x] = x_value;
    }
    __syncthreads();
}
""")

if __name__ == '__main__':

    image_ker = ker.get_function("image_ker")

    minx = 5.0 - AREA_WIDTH / 2.0
    miny = 15.0 - AREA_WIDTH / 2.0
    maxx = 25.0 + AREA_WIDTH / 2.0
    maxy = 26.0 + AREA_WIDTH / 2.0


    xw = int(round((maxx - minx) / grid_size))
    yw = int(round((maxy - miny) / grid_size))
    image = np.array([[0.0 for i in range(yw)]
                     for i in range(xw)], dtype=np.float32)
    print (minx, miny, maxx, maxy, xw, yw)
    image_x  = np.array([(np.float32(i)*grid_size + minx) for i in range(xw)], dtype = np.float32)
    image_y  = np.array([(np.float32(i)*grid_size + miny) for i in range(yw)], dtype = np.float32)

    image_gpu = gpuarray.to_gpu(image)
    image_x_gpu = gpuarray.to_gpu(image_x)
    image_y_gpu = gpuarray.to_gpu(image_y)

    image_ker(image_gpu, image_x_gpu, image_y_gpu, block=(32, 32, 1),
             grid=(5, 5, 1))

    image   = image_gpu.get()
    image_x = image_x_gpu.get()
    image_y = image_y_gpu.get()
    # print(grid_xw, grid_yw)
    for ix in range(xw):
        for jy in range(yw):
            print("x, {}, image[{}][{}], {}".format(image_x[ix], ix, jy, image[ix][jy]))

I expect the output to be

x, -25.0, image[0][0], -25.0
x, -25.0, image[0][1], -25.0
x, -25.0, image[0][2], -25.0
x, -25.0, image[0][3], -25.0
x, -25.0, image[0][4], -25.0
x, -25.0, image[0][5], -25.0
...
x, -4.0, image[42][77], -4.0
x, -4.0, image[42][78], -4.0
x, -4.0, image[42][79], -4.0
x, -4.0, image[42][80], -4.0
x, -4.0, image[42][81], -4.0
...
x, 54.5, image[159][138], 54.5
x, 54.5, image[159][139], 54.5
x, 54.5, image[159][140], 54.5
x, 54.5, image[159][141], 54.5

However, my output is

x, -25.0, image[0][0], -25.0
x, -25.0, image[0][1], -24.5
x, -25.0, image[0][2], -24.0
x, -25.0, image[0][3], -23.5
x, -25.0, image[0][4], -23.0
x, -25.0, image[0][5], -22.5
...
x, -4.0, image[42][77], 35.5
x, -4.0, image[42][78], 36.0
x, -4.0, image[42][79], 36.5
x, -4.0, image[42][80], 37.0
x, -4.0, image[42][81], 37.5
...
x, 54.5, image[159][138], 53.0
x, 54.5, image[159][139], 53.5
x, 54.5, image[159][140], 54.0
x, 54.5, image[159][141], 54.5

In the __global__ function, the two assignments are

image[offset] = x_value;
image_x[x] = x_value;

image_x[x] returns the correct value, but image[offset] returns what looks like some kind of reduced result.
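A possible explanation for these values, sketched on the CPU with NumPy rather than on the GPU: the host array has shape (xw, yw) = (160, 142), so NumPy stores image[ix][jy] at flat index ix * 142 + jy, whereas the kernel writes to x + y * 160. The helper emulate_kernel below is hypothetical (it is not part of PyCUDA) and only replays the kernel's index arithmetic. Under that assumption it reproduces the values from the actual output above.

```python
import numpy as np

XW, YW = 160, 142              # host image shape (xw, yw), as in the question
GRID_SIZE, MINX = 0.5, -25.0   # same values the question's script computes

image_x = (np.arange(XW, dtype=np.float32) * GRID_SIZE + MINX).astype(np.float32)


def emulate_kernel(image_x, xw, yw, pitch):
    """Replay image[offset] = image_x[x] with offset = x + y * pitch.

    Hypothetical CPU stand-in for the CUDA kernel, for illustration only.
    """
    flat = np.zeros(xw * yw, dtype=np.float32)
    # One (x, y) pair per thread that passes the bounds check.
    x, y = np.meshgrid(np.arange(xw), np.arange(yw), indexing="ij")
    offset = x + y * pitch
    flat[offset.ravel()] = image_x[x.ravel()]
    # The host then views the same buffer as a row-major (xw, yw) array.
    return flat.reshape(xw, yw)


# pitch = blockDim.x * gridDim.x = 32 * 5 = 160
image = emulate_kernel(image_x, XW, YW, pitch=160)
print(image[0][1], image[42][77], image[159][141])
```

This would also explain why the constant 10.0 looked correct: every element is still written exactly once, just not at the position the host-side indexing expects.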

My question is: is there some way to get the correct result back? Or am I missing something when assigning image_x[x] to image[offset]?
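One direction that might give the expected result, assuming the cause is a layout mismatch between the kernel's offset and the row-major (xw, yw) host array: compute the offset as x * yw + y, so that it matches how NumPy addresses image[ix][jy]. The sketch below checks only the index arithmetic on the CPU; it has not been run on a GPU. In the kernel, the corresponding line would presumably be `unsigned int offset = x * 142 + y;`.

```python
import numpy as np

XW, YW = 160, 142  # host image shape (xw, yw), as in the question
image_x = (np.arange(XW, dtype=np.float32) * 0.5 - 25.0).astype(np.float32)

flat = np.zeros(XW * YW, dtype=np.float32)
for x in range(XW):          # plays the role of the thread's x index
    for y in range(YW):      # plays the role of the thread's y index
        offset = x * YW + y  # row-major offset for an array of shape (XW, YW)
        flat[offset] = image_x[x]

image = flat.reshape(XW, YW)
print(image[0][1], image[42][77], image[159][141])
```

An alternative under the same assumption is to keep the kernel's offset as it is and allocate the host array with shape (yw, xw), reading it back as image[jy][ix]. That variant keeps neighbouring threads in x writing to neighbouring addresses, which is generally the friendlier access pattern on the GPU.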

0 answers:

No answers yet