OpenCL bus error

Time: 2011-05-12 12:26:23

Tags: opencl runtime-error

I have a problem with my OpenCL code. I compile and run it on the CPU (Core 2 Duo) under Mac OS X 10.6.7. Here is the code:

#define BUFSIZE (524288)    // 512 KB
#define BLOCKBYTES (32)    // 32 B
__kernel void test(__global unsigned char *in,
                   __global unsigned char *out,
                   unsigned int srcOffset,
                   unsigned int dstOffset) {
    int grId = get_group_id(0);
    unsigned char msg[BUFSIZE];
    srcOffset = grId * BUFSIZE;
    dstOffset = grId * BLOCKBYTES;

    // Copy from global to private memory
    size_t i;
    for (i = 0; i < BUFSIZE; i++)
        msg[i] = in[ srcOffset + i ];

    // Make some computation here, not complicated logic    

    // Copy from private to global memory
    for (i = 0; i < BLOCKBYTES; i++)
        out[ dstOffset + i ] = msg[i];
}

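The code gives me a runtime error, "Bus error". When I added printf calls to the loop that copies from global to private memory, I saw that the error occurs there, at a different iteration of i each time. When I reduce BUFSIZE to 262144 (256 KB), the code runs fine. I tried with only one work-item per work-group. The pointer in refers to a memory region holding several thousand KB of data. I suspected a private memory limit, but then I would expect the error when the array is allocated, not during the copy.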

Here is my OpenCL device query:

--------------------------------
 Device Intel(R) Core(TM)2 Duo CPU     P7550  @ 2.26GHz
 ---------------------------------
  CL_DEVICE_NAME:           Intel(R) Core(TM)2 Duo CPU     P7550  @ 2.26GHz
  CL_DEVICE_VENDOR:             Intel
  CL_DRIVER_VERSION:            1.0
  CL_DEVICE_VERSION:            OpenCL 1.0 
  CL_DEVICE_TYPE:           CL_DEVICE_TYPE_CPU
  CL_DEVICE_MAX_COMPUTE_UNITS:      2
  CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS:   3
  CL_DEVICE_MAX_WORK_ITEM_SIZES:    1 / 1 / 1 
  CL_DEVICE_MAX_WORK_GROUP_SIZE:    1
  CL_DEVICE_MAX_CLOCK_FREQUENCY:    2260 MHz
  CL_DEVICE_ADDRESS_BITS:       32
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:     1024 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:        1535 MByte
  CL_DEVICE_ERROR_CORRECTION_SUPPORT:   no
  CL_DEVICE_LOCAL_MEM_TYPE:     global
  CL_DEVICE_LOCAL_MEM_SIZE:     16 KByte
  CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:   64 KByte
  CL_DEVICE_QUEUE_PROPERTIES:       CL_QUEUE_PROFILING_ENABLE
  CL_DEVICE_IMAGE_SUPPORT:      1
  CL_DEVICE_MAX_READ_IMAGE_ARGS:    128
  CL_DEVICE_MAX_WRITE_IMAGE_ARGS:   8
  CL_DEVICE_SINGLE_FP_CONFIG:       denorms INF-quietNaNs round-to-nearest 

  CL_DEVICE_IMAGE <dim>         2D_MAX_WIDTH     8192
                    2D_MAX_HEIGHT    8192
                    3D_MAX_WIDTH     2048
                    3D_MAX_HEIGHT    2048
                    3D_MAX_DEPTH     2048

  CL_DEVICE_EXTENSIONS:         cl_khr_fp64
                    cl_khr_global_int32_base_atomics
                    cl_khr_global_int32_extended_atomics
                    cl_khr_local_int32_base_atomics
                    cl_khr_local_int32_extended_atomics
                    cl_khr_byte_addressable_store
                    cl_APPLE_gl_sharing
                    cl_APPLE_SetMemObjectDestructor
                    cl_APPLE_ContextLoggingFunctions

  CL_DEVICE_PREFERRED_VECTOR_WIDTH_<t>  CHAR 16, SHORT 8, INT 4, LONG 2, FLOAT 4, DOUBLE 2

1 answer:

Answer 0 (score: 1)

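You use a variable msg that is 512 KB in size. This variable has to live in private memory, and private memory is not that large. As far as I know, this should not work.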

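Why do you have the parameters srcOffset and dstOffset? You don't use them: the kernel overwrites both before ever reading the values passed in.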

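I don't see any other problems. Try allocating local memory instead, keeping in mind that your device reports only 16 KByte of it, so the data would have to be processed in smaller pieces. Do you have an unoptimized version of your code running? A version that computes only in global memory?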
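A global-memory-only version might look like the sketch below. It is not the asker's actual computation: the kernel name test_global_only, the XOR fold that stands in for the unspecified "some computation", and the host shim (fake_group_id and the empty __kernel/__global macros) are all assumptions made for illustration. The point is only that the 512 KB slice is read straight from global memory and reduced into a 32-byte private block, so no large private array is needed and the two unused offset parameters can go.

```c
// Host-side shim (hypothetical, not part of OpenCL): lets this file compile
// as plain C so the logic can be checked on the CPU. An OpenCL compiler
// defines __OPENCL_VERSION__ and skips this part.
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
static size_t fake_group_id = 0;   // stand-in for the real work-group id
static size_t get_group_id(unsigned int dim) { (void)dim; return fake_group_id; }
#endif

#define BUFSIZE    (524288)   // 512 KB of input per work-group
#define BLOCKBYTES (32)       // 32 B of output per work-group

__kernel void test_global_only(__global const unsigned char *in,
                               __global unsigned char *out)
{
    size_t grId = get_group_id(0);
    size_t srcOffset = grId * BUFSIZE;      // computed locally, not passed in
    size_t dstOffset = grId * BLOCKBYTES;

    // Only the small result block lives in private memory.
    unsigned char block[BLOCKBYTES];
    size_t i;
    for (i = 0; i < BLOCKBYTES; i++)
        block[i] = 0;

    // Placeholder computation (assumed): XOR-fold the input slice,
    // reading each byte directly from global memory.
    for (i = 0; i < BUFSIZE; i++)
        block[i % BLOCKBYTES] ^= in[srcOffset + i];

    // Copy the result from private to global memory.
    for (i = 0; i < BLOCKBYTES; i++)
        out[dstOffset + i] = block[i];
}
```

If this version runs with BUFSIZE at 512 KB, that would support the private memory explanation, since the only thing removed is the large msg array.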