Question

I'm having a hard time figuring out what is wrong with the following kernel:

__kernel void test(global unsigned char *word, int len) {

    uint chunks[16];

    // Init the array with zeros

    for (int i = 0;i<16; i++) {
        printf("%d\n",i);
        chunks[i] = 0;
    }

    // padding

    for (uint i = 0;i<len;i+=4) {

        chunks[i/4] = 0;
        for(uint j = 0;j<4 && i+j < len;j++) {

            uint c = word[i+j]<<(8*(3-j));
            chunks[i/4] |= c;

        }

    }

    // bit-wise print of the first element of the array, just as a test

     for (int j = 0;j<32;j++) {
        printf("%d ",(chunks[0]>>(31-j))&1);
    }
}

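The kernel is meant to perform a simple padding, storing 4 chars in each uint. For now this is only a test, so I create a single work group and the kernel actually runs only once.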
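For comparison, here is a minimal host-side sketch in plain C of the packing the kernel is meant to do, so the GPU output can be checked against a known-good result. The names `pack_chunks` and `NUM_CHUNKS` are hypothetical and do not appear in the original code; the sketch assumes the intended layout is big-endian within each word, as the shift `8*(3-j)` in the kernel suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CHUNKS 16 /* same size as the kernel's chunks[] array */

/* Hypothetical host-side reference for the kernel's padding loop:
 * pack `len` bytes into 32-bit words, first byte in the highest bits,
 * leaving every unused bit zero. */
void pack_chunks(const unsigned char *word, size_t len,
                 uint32_t chunks[NUM_CHUNKS])
{
    /* 16 words hold at most 64 bytes; longer input would overflow. */
    assert(len <= NUM_CHUNKS * 4);

    for (size_t i = 0; i < NUM_CHUNKS; i++)
        chunks[i] = 0;

    for (size_t i = 0; i < len; i++) {
        /* Byte 0 of each group lands in bits 31..24, byte 3 in bits 7..0. */
        unsigned shift = 8u * (3u - (unsigned)(i % 4));
        chunks[i / 4] |= (uint32_t)word[i] << shift;
    }
}
```

For the 6-byte input "string" used in the host code, this gives `chunks[0] == 0x73747269` and `chunks[1] == 0x6e670000`, with the remaining words zero. The bit pattern printed by the kernel can be compared against these values.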

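The problem is that the program receives a SIGABRT, and I don't really understand why.
Trying to track the problem down, I noticed that the code seems to work if I comment out the "init" part, leaving the padding and the bit-wise print. Conversely, if I comment out the padding, leaving the init and the print, it also works.
Moreover, if I remove the padding and replace the bit-wise print of the first element with a bit-wise print of the whole chunks array, I keep getting the SIGABRT: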

for (int i = 0;i<16; i++) {
    for (int j = 0;j<32;j++) {
        printf("%d ",(chunks[i]>>(31-j))&1);
    }
    printf("\n");
}

If I launch it on the CPU instead of the GPU (using CL_DEVICE_TYPE_CPU directly in the host code below), the code works fine.

I'm afraid that what I'm doing is conceptually wrong, but I haven't found any useful hints in the specs or by searching the web.

I'm using MacOS 10.8.5. Here is my host code:

int main(int argc, const char * argv[]) {

    char name[128];

    dispatch_queue_t queue = gcl_create_dispatch_queue(CL_DEVICE_TYPE_GPU, NULL);

    if (queue == NULL) {
        queue = gcl_create_dispatch_queue(CL_DEVICE_TYPE_CPU, NULL);
    }

    // print name of the device 
    cl_device_id gpu = gcl_get_device_id_with_dispatch_queue(queue);
    clGetDeviceInfo(gpu, CL_DEVICE_NAME, 128, name, NULL);
    printf("Created a dispatch queue using the %s\n", name);


    unsigned char *word = (unsigned char*) malloc(7*sizeof(unsigned char));
    sprintf(word, "string");

    void* word_mem  = gcl_malloc(7 * sizeof(char), word,
                           CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR);

    dispatch_sync(queue, ^{

        cl_ndrange range = {                                             
            1,
            {0, 0, 0},
            {1, 0, 0},
            {0, 0, 0}
        };

        test_kernel(&range,(unsigned char*)word_mem, 6);

    });

    gcl_free(word_mem);    
    free(word);

    dispatch_release(queue);
    return 0;
}

Answer 1

Are you sure you can use 'printf' in an OpenCL kernel on an Nvidia GPU?

I suspect printf only works with CL_DEVICE_TYPE_CPU or with CL_DEVICE_TYPE_GPU + AMD.

Good luck, Moises
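One way to test this suspicion is to look at what the device reports about itself. The sketch below is a heuristic only, written under two assumptions: that printf is a built-in from OpenCL 1.2 onward, and that some 1.1 devices advertise it through the vendor extensions `cl_amd_printf` or `cl_intel_printf`. The function name `device_likely_has_printf` is hypothetical. It works on plain strings so it can be tried without an OpenCL device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: guess whether kernels on a device may call printf.
 * `version` is the CL_DEVICE_VERSION string ("OpenCL <major>.<minor> ...");
 * `extensions` is the space-separated CL_DEVICE_EXTENSIONS string. */
bool device_likely_has_printf(const char *version, const char *extensions)
{
    assert(version != NULL && extensions != NULL);

    int major = 0, minor = 0;
    /* Assumption: printf is a core built-in from OpenCL 1.2 onward. */
    if (sscanf(version, "OpenCL %d.%d", &major, &minor) == 2 &&
        (major > 1 || (major == 1 && minor >= 2)))
        return true;

    /* Assumption: older devices may expose printf as a vendor extension. */
    return strstr(extensions, "cl_amd_printf") != NULL ||
           strstr(extensions, "cl_intel_printf") != NULL;
}
```

In the host code from the question, the two strings could be fetched with `clGetDeviceInfo(gpu, CL_DEVICE_VERSION, ...)` and `clGetDeviceInfo(gpu, CL_DEVICE_EXTENSIONS, ...)`, the same call already used there for `CL_DEVICE_NAME`. A positive result does not guarantee that printf behaves correctly on a particular driver; it only rules out the case where the device does not claim support at all.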
