Julia: calling a custom ArrayFire function that uses a custom kernel gives ArgumentError: cannot convert NULL to string

Time: 2019-01-23 22:50:03

Tags: c++ kernel julia opencl arrayfire

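I have a custom kernel and its corresponding custom ArrayFire function, and I want to call this function from Julia. When I do so with relatively large arrays, I get: ArgumentError: cannot convert NULL to string. I know there may be a size limit that depends on the type of GPU, but the trouble is that this limit changes when, for example, I swap in a different kernel.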

Kernel (the Gauss_Jordan_f kernel)


#pragma OPENCL EXTENSION cl_khr_fp64 : enable
__kernel void
Gauss_Jordan_f(__global double* A, __global double* B, int gsize)
{
    int tx = get_global_id(0);
    if (tx >= gsize)
    {
        return;
    }
    int Rfirst = tx * gsize;
    double diag;
    double fm;

    for (int i = 0; i < gsize; i++)
    {
        diag = A[i*gsize + i];

        if (tx != i && diag != 0)
        {
            fm = A[Rfirst + i] / diag;
            B[tx] -= fm * B[i];

            for (int j = i + 1; j < gsize; j++)
            {
                A[Rfirst + j] -= fm * A[i*gsize + j];
            }
        }

        barrier(CLK_LOCAL_MEM_FENCE);
    }

    barrier(CLK_LOCAL_MEM_FENCE);

    B[tx] /= A[Rfirst + tx];
}
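
For cross-checking the GPU result on small systems, a sequential host-side version of the same elimination might look like the sketch below. `gauss_jordan_reference` is a hypothetical name that does not appear in the post. It assumes the `A[row * n + col]` indexing used in the kernel and, like the kernel, does no pivoting.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference: the same arithmetic as Gauss_Jordan_f,
// run sequentially. A holds n*n values indexed as A[row * n + col], exactly
// as in the kernel. B holds n values and contains the solution on return.
void gauss_jordan_reference(std::vector<double>& A, std::vector<double>& B) {
    const std::size_t n = B.size();
    assert(A.size() == n * n);

    for (std::size_t i = 0; i < n; ++i) {
        const double diag = A[i * n + i];
        if (diag == 0.0) continue;  // the kernel also skips a zero pivot
        for (std::size_t row = 0; row < n; ++row) {
            if (row == i) continue;  // the pivot row itself is left untouched
            const double fm = A[row * n + i] / diag;
            B[row] -= fm * B[i];
            for (std::size_t j = i + 1; j < n; ++j) {
                A[row * n + j] -= fm * A[i * n + j];
            }
        }
    }
    // Final scaling by the diagonal, as in the last line of the kernel.
    for (std::size_t row = 0; row < n; ++row) {
        B[row] /= A[row * n + row];
    }
}
```

Because the loop over `row` only starts once the previous pivot step has finished for every row, this version gives the result the kernel would produce if all work-items were synchronized between pivot steps.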

Function (takes af::array arguments and uses the Gauss_Jordan_f kernel)


void AFire::SELgj_(af::array &A, af::array &B) {

    static cl_context af_context = afcl::getContext();
    static cl_device_id af_device_id = afcl::getDeviceId();
    static cl_command_queue af_queue = afcl::getQueue();

    cl_mem * d_A = A.device<cl_mem>();
    cl_mem * d_B = B.device<cl_mem>();

    size_t order = (int)A.dims(0);

    size_t program_length = strlen(GJordan_source);
    int status = CL_SUCCESS;

    cl_program program = clCreateProgramWithSource(af_context, 1, (const char **)&GJordan_source, &program_length, &status);
    status = clBuildProgram(program, 1, &af_device_id, NULL, NULL, NULL);
    cl_kernel kernel = clCreateKernel(program, "Gauss_Jordan_f", &status);

    clSetKernelArg(kernel, 0, sizeof(cl_mem), d_A);
    clSetKernelArg(kernel, 1, sizeof(cl_mem), d_B);
    clSetKernelArg(kernel, 2, sizeof(cl_int), &order);

    size_t localWorkSize = BLOCK_SIZE * BLOCK_SIZE;
    size_t globalWorkSize = shrRoundUp(localWorkSize, order);

    clEnqueueNDRangeKernel(af_queue, kernel, 1, 0, &globalWorkSize, &localWorkSize,
        0, NULL, NULL);

    A.unlock();
    B.unlock();
}
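
The launch geometry follows from `BLOCK_SIZE` and the matrix order. `BLOCK_SIZE` is not shown in the post, so the numbers here are an assumption: a value of 32 would give a local size of 1024, which is the `CL_DEVICE_MAX_WORK_GROUP_SIZE` reported for this GPU. The sketch also assumes `shrRoundUp` rounds its second argument up to a multiple of the first. `round_up` and `work_group_count` are hypothetical helpers, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for shrRoundUp: round `global` up to the next
// multiple of `local` (assumed behaviour, the helper is not shown in the post).
std::size_t round_up(std::size_t local, std::size_t global) {
    assert(local > 0);
    const std::size_t r = global % local;
    return r == 0 ? global : global + local - r;
}

// Number of work-groups a 1-D launch covering `global` items is split into.
std::size_t work_group_count(std::size_t local, std::size_t global) {
    return round_up(local, global) / local;
}
```

Under those assumptions, `work_group_count(1024, 1000)` is 1 and `work_group_count(1024, 2000)` is 2. In OpenCL, `barrier` only synchronizes work-items that belong to the same work-group, so it may be worth checking whether the size at which the function stops working coincides with the launch needing more than one work-group.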

Calling from Julia


A is an AFArray: 2000×2000 Array{Float64,2}, and B is a vector compatible with A (size 2000). I know the system is consistent and I know its solution, but:

 ccall((:SELgj_,"path/to/dll")
         ,Cvoid,(Ref{af_array},Ref{af_array}),Af.arr,Df.arr)

The call itself seems to finish fine. However, displaying Df then gives:

Df

ArgumentError: cannot convert NULL to string

Stacktrace:
 [1] unsafe_string at .\strings\string.jl:56 [inlined]
 [2] unsafe_string at .\c.jl:193 [inlined]
 [3] get_last_error at C:\Users\user\.julia\packages\ArrayFire\4SkOz\src\util.jl:299 [inlined]
 [4] _error(::UInt32) at C:\Users\user\.julia\packages\ArrayFire\4SkOz\src\util.jl:86
 [5] convert_array(::AFArray{Float64,2}) at C:\Users\user\.julia\packages\ArrayFire\4SkOz\src\wrap.jl:748
 [6] Type at C:\Users\user\.julia\packages\ArrayFire\4SkOz\src\array.jl:32 [inlined]
 [7] toa(::AFArray{Float64,2}) at C:\Users\user\.julia\packages\ArrayFire\4SkOz\src\util.jl:37
 [8] show(::IOContext{Base.GenericIOBuffer{Array{UInt8,1}}}, ::MIME{Symbol("text/plain")}, ::AFArray{Float64,2}) at C:\Users\user\.julia\packages\ArrayFire\4SkOz\src\util.jl:41
 [9] limitstringmime(::MIME{Symbol("text/plain")}, ::AFArray{Float64,2}) at C:\Users\user\.julia\packages\IJulia\GIANC\src\inline.jl:37
 [10] display_mimestring(::MIME{Symbol("text/plain")}, ::AFArray{Float64,2}) at C:\Users\user\.julia\packages\IJulia\GIANC\src\display.jl:66
 [11] display_dict(::AFArray{Float64,2}) at C:\Users\user\.julia\packages\IJulia\GIANC\src\display.jl:95
 [12] #invokelatest#1 at .\essentials.jl:697 [inlined]
 [13] invokelatest at .\essentials.jl:696 [inlined]
 [14] execute_request(::ZMQ.Socket, ::Msg) at C:\Users\user\.julia\packages\IJulia\GIANC\src\execute_request.jl:95
 [15] #invokelatest#1 at .\essentials.jl:697 [inlined]
 [16] invokelatest at .\essentials.jl:696 [inlined]
 [17] eventloop(::ZMQ.Socket) at C:\Users\user\.julia\packages\IJulia\GIANC\src\eventloop.jl:8
 [18] (::getfield(IJulia, Symbol("##15#18")))() at .\task.jl:259

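From this point on, ArrayFire objects can no longer be created or modified. This function works fine for sizes up to 1000, but I have other functions and kernels that also work fine at larger sizes, and I don't know why the limit varies.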
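
The `SELgj_` function as posted never inspects the OpenCL status codes it receives, so a failure inside it would only surface later, for instance when ArrayFire next touches the device. One way to locate the failing step might be to check each status as it is returned. The helper below is a minimal sketch: `check_cl` is a hypothetical name, and it takes a plain `int` (OpenCL status codes are integers and `CL_SUCCESS` is 0) so that it compiles without the OpenCL headers.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: throw as soon as an OpenCL call reports a failure.
// `status` is the cl_int returned (or written) by the call, `what` names it.
void check_cl(int status, const char* what) {
    assert(what != nullptr);
    if (status != 0) {  // 0 is CL_SUCCESS
        throw std::runtime_error(std::string(what) +
                                 " failed with OpenCL status " +
                                 std::to_string(status));
    }
}
```

It would wrap each call in `SELgj_`, for example `check_cl(clBuildProgram(...), "clBuildProgram")`. Calling `clFinish(af_queue)` right after the enqueue and checking its result in the same way should report a failed launch from inside `SELgj_` rather than from a later ArrayFire call.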

ArrayFire details

ArrayFire v3.6.1 (OpenCL, 64-bit Windows, build b443e14)
[0] NVIDIA: GeForce GT 630M, 2048 MB
-1- INTEL: Intel(R) HD Graphics 4000, 1400 MB
-2- INTEL: Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz, 6037 MB

OpenCL details

ocldevicequery
[ocldevicequery] starting...

ocldevicequery Starting...

OpenCL SW Info:

 CL_PLATFORM_NAME:      NVIDIA CUDA
 CL_PLATFORM_VERSION:   OpenCL 1.2 CUDA 9.1.84
 OpenCL SDK Revision:   7027912


OpenCL Device Info:

 1 devices found supporting OpenCL:

 ---------------------------------
 Device GeForce GT 630M
 ---------------------------------
  CL_DEVICE_NAME:                       GeForce GT 630M
  CL_DEVICE_VENDOR:                     NVIDIA Corporation
  CL_DRIVER_VERSION:                    391.35
  CL_DEVICE_VERSION:                    OpenCL 1.1 CUDA
  CL_DEVICE_OPENCL_C_VERSION:           OpenCL C 1.1
  CL_DEVICE_TYPE:                       CL_DEVICE_TYPE_GPU
  CL_DEVICE_MAX_COMPUTE_UNITS:          2
  CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS:   3
  CL_DEVICE_MAX_WORK_ITEM_SIZES:        1024 / 1024 / 64
  CL_DEVICE_MAX_WORK_GROUP_SIZE:        1024
  CL_DEVICE_MAX_CLOCK_FREQUENCY:        950 MHz
  CL_DEVICE_ADDRESS_BITS:               64
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:         512 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:            2048 MByte
  CL_DEVICE_ERROR_CORRECTION_SUPPORT:   no
  CL_DEVICE_LOCAL_MEM_TYPE:             local
  CL_DEVICE_LOCAL_MEM_SIZE:             48 KByte
  CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:   64 KByte
  CL_DEVICE_QUEUE_PROPERTIES:           CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
  CL_DEVICE_QUEUE_PROPERTIES:           CL_QUEUE_PROFILING_ENABLE
  CL_DEVICE_IMAGE_SUPPORT:              1
  CL_DEVICE_MAX_READ_IMAGE_ARGS:        128
  CL_DEVICE_MAX_WRITE_IMAGE_ARGS:       8
  CL_DEVICE_SINGLE_FP_CONFIG:           denorms INF-quietNaNs round-to-nearest round-to-zero round-to-inf fma

  CL_DEVICE_IMAGE <dim>                 2D_MAX_WIDTH     16384
                                        2D_MAX_HEIGHT    16384
                                        3D_MAX_WIDTH     2048
                                        3D_MAX_HEIGHT    2048
                                        3D_MAX_DEPTH     2048

  CL_DEVICE_EXTENSIONS:                 cl_khr_global_int32_base_atomics
                                        cl_khr_global_int32_extended_atomics
                                        cl_khr_local_int32_base_atomics
                                        cl_khr_local_int32_extended_atomics
                                        cl_khr_fp64
                                        cl_khr_byte_addressable_store
                                        cl_khr_icd
                                        cl_khr_gl_sharing
                                        cl_nv_compiler_options
                                        cl_nv_device_attribute_query
                                        cl_nv_pragma_unroll
                                        cl_nv_d3d10_sharing
                                        cl_khr_d3d10_sharing
                                        cl_nv_d3d11_sharing
                                        cl_nv_copy_opts


  CL_DEVICE_COMPUTE_CAPABILITY_NV:      2.1
  NUMBER OF MULTIPROCESSORS:            2
  NUMBER OF CUDA CORES:                 96
  CL_DEVICE_REGISTERS_PER_BLOCK_NV:     32768
  CL_DEVICE_WARP_SIZE_NV:               32
  CL_DEVICE_GPU_OVERLAP_NV:             CL_TRUE
  CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV:     CL_TRUE
  CL_DEVICE_INTEGRATED_MEMORY_NV:       CL_FALSE
  CL_DEVICE_PREFERRED_VECTOR_WIDTH_<t>  CHAR 1, SHORT 1, INT 1, LONG 1, FLOAT 1, DOUBLE 1


  ---------------------------------
  2D Image Formats Supported (71)
  ---------------------------------
  #     Channel Order   Channel Type

  1     CL_R            CL_FLOAT
  2     CL_R            CL_HALF_FLOAT
  3     CL_R            CL_UNORM_INT8
  4     CL_R            CL_UNORM_INT16
  5     CL_R            CL_SNORM_INT16
  6     CL_R            CL_SIGNED_INT8
  7     CL_R            CL_SIGNED_INT16
  8     CL_R            CL_SIGNED_INT32
  9     CL_R            CL_UNSIGNED_INT8
  10    CL_R            CL_UNSIGNED_INT16
  11    CL_R            CL_UNSIGNED_INT32
  12    CL_A            CL_FLOAT
  13    CL_A            CL_HALF_FLOAT
  14    CL_A            CL_UNORM_INT8
  15    CL_A            CL_UNORM_INT16
  16    CL_A            CL_SNORM_INT16
  17    CL_A            CL_SIGNED_INT8
  18    CL_A            CL_SIGNED_INT16
  19    CL_A            CL_SIGNED_INT32
  20    CL_A            CL_UNSIGNED_INT8
  21    CL_A            CL_UNSIGNED_INT16
  22    CL_A            CL_UNSIGNED_INT32
  23    CL_RG           CL_FLOAT
  24    CL_RG           CL_HALF_FLOAT
  25    CL_RG           CL_UNORM_INT8
  26    CL_RG           CL_UNORM_INT16
  27    CL_RG           CL_SNORM_INT16
  28    CL_RG           CL_SIGNED_INT8
  29    CL_RG           CL_SIGNED_INT16
  30    CL_RG           CL_SIGNED_INT32
  31    CL_RG           CL_UNSIGNED_INT8
  32    CL_RG           CL_UNSIGNED_INT16
  33    CL_RG           CL_UNSIGNED_INT32
  34    CL_RA           CL_FLOAT
  35    CL_RA           CL_HALF_FLOAT
  36    CL_RA           CL_UNORM_INT8
  37    CL_RA           CL_UNORM_INT16
  38    CL_RA           CL_SNORM_INT16
  39    CL_RA           CL_SIGNED_INT8
  40    CL_RA           CL_SIGNED_INT16
  41    CL_RA           CL_SIGNED_INT32
  42    CL_RA           CL_UNSIGNED_INT8
  43    CL_RA           CL_UNSIGNED_INT16
  44    CL_RA           CL_UNSIGNED_INT32
  45    CL_RGBA         CL_FLOAT
  46    CL_RGBA         CL_HALF_FLOAT
  47    CL_RGBA         CL_UNORM_INT8
  48    CL_RGBA         CL_UNORM_INT16
  49    CL_RGBA         CL_SNORM_INT16
  50    CL_RGBA         CL_SIGNED_INT8
  51    CL_RGBA         CL_SIGNED_INT16
  52    CL_RGBA         CL_SIGNED_INT32
  53    CL_RGBA         CL_UNSIGNED_INT8
  54    CL_RGBA         CL_UNSIGNED_INT16
  55    CL_RGBA         CL_UNSIGNED_INT32
  56    CL_BGRA         CL_UNORM_INT8
  57    CL_BGRA         CL_SIGNED_INT8
  58    CL_BGRA         CL_UNSIGNED_INT8
  59    CL_ARGB         CL_UNORM_INT8
  60    CL_ARGB         CL_SIGNED_INT8
  61    CL_ARGB         CL_UNSIGNED_INT8
  62    CL_INTENSITY    CL_FLOAT
  63    CL_INTENSITY    CL_HALF_FLOAT
  64    CL_INTENSITY    CL_UNORM_INT8
  65    CL_INTENSITY    CL_UNORM_INT16
  66    CL_INTENSITY    CL_SNORM_INT16
  67    CL_LUMINANCE    CL_FLOAT
  68    CL_LUMINANCE    CL_HALF_FLOAT
  69    CL_LUMINANCE    CL_UNORM_INT8
  70    CL_LUMINANCE    CL_UNORM_INT16
  71    CL_LUMINANCE    CL_SNORM_INT16

  ---------------------------------
  3D Image Formats Supported (71)
  ---------------------------------
  #     Channel Order   Channel Type

  1     CL_R            CL_FLOAT
  2     CL_R            CL_HALF_FLOAT
  3     CL_R            CL_UNORM_INT8
  4     CL_R            CL_UNORM_INT16
  5     CL_R            CL_SNORM_INT16
  6     CL_R            CL_SIGNED_INT8
  7     CL_R            CL_SIGNED_INT16
  8     CL_R            CL_SIGNED_INT32
  9     CL_R            CL_UNSIGNED_INT8
  10    CL_R            CL_UNSIGNED_INT16
  11    CL_R            CL_UNSIGNED_INT32
  12    CL_A            CL_FLOAT
  13    CL_A            CL_HALF_FLOAT
  14    CL_A            CL_UNORM_INT8
  15    CL_A            CL_UNORM_INT16
  16    CL_A            CL_SNORM_INT16
  17    CL_A            CL_SIGNED_INT8
  18    CL_A            CL_SIGNED_INT16
  19    CL_A            CL_SIGNED_INT32
  20    CL_A            CL_UNSIGNED_INT8
  21    CL_A            CL_UNSIGNED_INT16
  22    CL_A            CL_UNSIGNED_INT32
  23    CL_RG           CL_FLOAT
  24    CL_RG           CL_HALF_FLOAT
  25    CL_RG           CL_UNORM_INT8
  26    CL_RG           CL_UNORM_INT16
  27    CL_RG           CL_SNORM_INT16
  28    CL_RG           CL_SIGNED_INT8
  29    CL_RG           CL_SIGNED_INT16
  30    CL_RG           CL_SIGNED_INT32
  31    CL_RG           CL_UNSIGNED_INT8
  32    CL_RG           CL_UNSIGNED_INT16
  33    CL_RG           CL_UNSIGNED_INT32
  34    CL_RA           CL_FLOAT
  35    CL_RA           CL_HALF_FLOAT
  36    CL_RA           CL_UNORM_INT8
  37    CL_RA           CL_UNORM_INT16
  38    CL_RA           CL_SNORM_INT16
  39    CL_RA           CL_SIGNED_INT8
  40    CL_RA           CL_SIGNED_INT16
  41    CL_RA           CL_SIGNED_INT32
  42    CL_RA           CL_UNSIGNED_INT8
  43    CL_RA           CL_UNSIGNED_INT16
  44    CL_RA           CL_UNSIGNED_INT32
  45    CL_RGBA         CL_FLOAT
  46    CL_RGBA         CL_HALF_FLOAT
  47    CL_RGBA         CL_UNORM_INT8
  48    CL_RGBA         CL_UNORM_INT16
  49    CL_RGBA         CL_SNORM_INT16
  50    CL_RGBA         CL_SIGNED_INT8
  51    CL_RGBA         CL_SIGNED_INT16
  52    CL_RGBA         CL_SIGNED_INT32
  53    CL_RGBA         CL_UNSIGNED_INT8
  54    CL_RGBA         CL_UNSIGNED_INT16
  55    CL_RGBA         CL_UNSIGNED_INT32
  56    CL_BGRA         CL_UNORM_INT8
  57    CL_BGRA         CL_SIGNED_INT8
  58    CL_BGRA         CL_UNSIGNED_INT8
  59    CL_ARGB         CL_UNORM_INT8
  60    CL_ARGB         CL_SIGNED_INT8
  61    CL_ARGB         CL_UNSIGNED_INT8
  62    CL_INTENSITY    CL_FLOAT
  63    CL_INTENSITY    CL_HALF_FLOAT
  64    CL_INTENSITY    CL_UNORM_INT8
  65    CL_INTENSITY    CL_UNORM_INT16
  66    CL_INTENSITY    CL_SNORM_INT16
  67    CL_LUMINANCE    CL_FLOAT
  68    CL_LUMINANCE    CL_HALF_FLOAT
  69    CL_LUMINANCE    CL_UNORM_INT8
  70    CL_LUMINANCE    CL_UNORM_INT16
  71    CL_LUMINANCE    CL_SNORM_INT16

oclDeviceQuery, Platform Name = NVIDIA CUDA, Platform Version = OpenCL 1.2 CUDA 9.1.84, SDK Revision = 7027912, NumDevs = 1, Device = GeForce GT 630M

System Info:

 Local Time/Date = 17:37:54, 1/23/2019
 CPU Arch: 9
 CPU Level: 6
 # of CPU processors: 8
 Windows Build: 9200
 Windows Ver: 6.2 (Windows Vista / Windows 7)


[ocldevicequery] test results...
PASSED

0 answers:

No answers yet