Question

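I need to read pixels from two parts of an image that have the same width and height (for example, the squares ([0,0], [300,300]) and ([400,0], [700,300])) and compute the difference for each pair of pixels.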

This is the C (pseudo)code:

/**
 * @param img Input image
 * @param pos Integer position of top left corner of the second square (in this case 400)
 */
double getSum(Image& img, int pos)
{
    const int width_of_cut = 300;
    int right_bottom = pos + width_of_cut;

    Rgb first, second;
    double ret_val = 0.0;

    for(int i=0; i < width_of_cut; i++)
    {
        for(int j=0; j < width_of_cut; j++)
        {
            first  = img.getPixel( i, j );
            second = img.getPixel( i + pos, j );

            ret_val += ( first.R - second.R ) +
                       ( first.G - second.G ) +
                       ( first.B - second.B );
        }
    }

    return ret_val;
}

But my kernel (called from the host code with the same parameters, and with __global float* output initialized to 0.0) gives me completely different values:

__constant sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE |
                               CLK_ADDRESS_CLAMP_TO_EDGE |
                               CLK_FILTER_NEAREST;



__kernel void getSum( __read_only image2d_t input,
                        const int x_coord,
                      __global float* output )
{
    int width  = get_image_width( input );
    int height = get_image_height( input );

    int2 pixelcoord = (int2) (get_global_id(0), get_global_id(1)); // image coordinates

    const int width_of_cut = 300; 

    const int right_bottom = x_coord + width_of_cut;

    int a,b;
    a = (int)(pixelcoord.x + x_coord);
    b = pixelcoord.y; 

    if( a < right_bottom && b < width_of_cut )
    {
        float4 first = read_imagef(input, sampler, pixelcoord);
        float4 second = read_imagef(input, sampler, (int2)(a,b));

        output[get_global_id(0)] += ((first.x - second.x) +
                                    (first.y - second.y) +
                                    (first.z - second.z));
    }

}

I am new to OpenCL and I don't know what I am doing wrong.

Update (1D image):

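I changed the kernel code. Now I read a 1D image in a loop, but I still don't get the correct values. I don't know how to read pixels from a 1D image correctly.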

__kernel void getSum( __read_only image1d_t input,
                        const int x_coord,
                      __global float* output,
                        const int img_width )
{

    const int width_of_cut = 300; 

    int i = (int)(get_global_id(0));

    for(int j=0; j < width_of_cut; j++)
    {
        int f = ( img_width*i + j );
        int s = f + x_coord;

        float4 first = read_imagef( input, sampler, f ); //pixel from 1st sq.
        float4 second = read_imagef( input, sampler, s ); //pixel from 2nd sq.

        output[get_global_id(0)] += ((first.x - second.x) +
                                     (first.y - second.y) +
                                     (first.z - second.z));
    }    
}

Answer 1

Race condition.

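All the vertical work items access the same output memory (output[get_global_id(0)] +=), and they do so non-atomically. The result is therefore likely to be wrong: for example, two threads read the same value, each adds something to it, and each writes it back. Only one of them wins.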
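
To make the lost update concrete, here is a minimal sketch in plain C that replays one possible interleaving by hand. It is not OpenCL and it does not use real threads; the function names lost_update and serialized_update are made up for this illustration, and the interleaving shown is only one of many a device could produce.

```c
/* Replays the interleaving "A reads, B reads, A writes, B writes" on a
 * single memory cell, which is what an unsynchronized
 * output[get_global_id(0)] += x can do when two work items share an index. */
float lost_update(float initial, float add_a, float add_b)
{
    float cell = initial;

    float seen_by_a = cell;    /* work item A reads the old value        */
    float seen_by_b = cell;    /* work item B reads the same old value   */

    cell = seen_by_a + add_a;  /* A writes its result                    */
    cell = seen_by_b + add_b;  /* B overwrites it, A's addition is lost  */

    return cell;
}

/* The result you get when the two updates happen one after the other. */
float serialized_update(float initial, float add_a, float add_b)
{
    float cell = initial;
    cell += add_a;
    cell += add_b;
    return cell;
}
```

With an initial value of 0 and additions of 1 and 2, the interleaved version ends at 2 while the serialized version ends at 3. In the 2D kernel from the question, every row of work items with the same get_global_id(0) competes for one output element in this way.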

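If your device supports it, you could make this an atomic operation, but that will be slow. You would be better off running a 1D kernel that contains a loop accumulating these values vertically (that is, the j loop from your C example).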
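
The one-work-item-per-column idea can be sketched as a plain C emulation. This is only an illustration under assumptions: the image is taken to be a flat array of floats with three channels per pixel in row-major order, and the names pixel_diff, column_sum and get_sum are hypothetical. In a real kernel the loop over i in get_sum would disappear, i would come from get_global_id(0), and the pixel reads would be read_imagef calls.

```c
#include <stddef.h>

/* Assumed layout: pixel (x, y) starts at img[3 * (y * width + x)]. */
static float pixel_diff(const float *img, int width, int x, int y, int pos)
{
    const float *a = &img[3 * ((size_t)y * width + x)];        /* first square  */
    const float *b = &img[3 * ((size_t)y * width + x + pos)];  /* second square */
    return (a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]);
}

/* Body of one work item; i stands in for get_global_id(0). */
void column_sum(const float *img, int width, int pos, int cut,
                int i, float *output)
{
    float acc = 0.0f;                 /* private accumulator, never shared */
    for (int j = 0; j < cut; j++)     /* the j loop from the C example     */
        acc += pixel_diff(img, width, i, j, pos);
    output[i] = acc;                  /* one write, only this item uses output[i] */
}

/* Host-side stand-in for the 1D launch followed by a final reduction. */
double get_sum(const float *img, int width, int pos, int cut, float *output)
{
    double total = 0.0;
    for (int i = 0; i < cut; i++)     /* emulates cut work items */
        column_sum(img, width, pos, cut, i, output);
    for (int i = 0; i < cut; i++)     /* add up the per-column partial sums */
        total += output[i];
    return total;
}
```

Because each work item owns exactly one element of output and writes it once, no atomics are needed. The output buffer then holds one partial sum per column, which the host (or a second small kernel) still has to add up.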
