Question

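I have an EC2 instance with a Tesla K80 GPU. I am trying to run my OpenCL code on it, which performs heavy computation on the GPU, and I am running into a very strange problem.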

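I send data to the device and then run computations on it. With the number of threads (work-items) set to 200, it works fine. But if I increase it to 400, my computation produces "nan" values.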

Here is how I call the kernel:

kernelGA(cl::EnqueueArgs(queue[iter],
                    cl::NDRange(200 / numberOfDevices)),
                    d_value,
                    d_doubleParameters,
                    buf_half_population, and so on...)

numberOfDevices is 1 here.

Here is the kernel code:

__kernel void kernelGA (__global int * Parameters,
                          // other parameters etc. 
                         ) {
    int idx = get_global_id(0);

    if (idx != 0 && idx < POP_SIZE / numberOfDevices) {
      // Some computations using threads individually.
      if ((mutationRNBPercentage[(idx * chromosomeLength) + (deviceNumber * POP_SIZE * chromosomeLength / numberOfDevices) + j]) < transitionMutationPercentage) {
         // Apply some mutation to generate the next Individual.
         temp = temp + mutationRNBTransition[(idx * 3) + (deviceNumber * POP_SIZE * 3 / numberOfDevices) + j];
      }
    }

    for (int i = 0; i < 1000; i++) {
      // Other heavy computations using threads individually.
    }
}
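One thing that could be checked on the host is whether the buffers are large enough for the indices the kernel computes. The guard in the kernel depends on POP_SIZE, not on the global size passed to cl::NDRange, so the two can disagree. The sketch below is only an illustration in plain C++: the helper names are hypothetical, and it assumes that j runs from 0 up to (but not including) the stride (3 for mutationRNBTransition, chromosomeLength for mutationRNBPercentage), which the post does not state.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the original post): largest element index
// that the kernel expression
//   idx * stride + deviceNumber * popSize * stride / numberOfDevices + j
// can reach when idx < popSize / numberOfDevices.
// Assumption: j ranges over 0 .. stride - 1.
std::size_t maxIndexTouched(std::size_t popSize, std::size_t stride,
                            std::size_t numberOfDevices,
                            std::size_t deviceNumber) {
    assert(numberOfDevices > 0 && stride > 0);
    assert(popSize >= numberOfDevices && deviceNumber < numberOfDevices);
    const std::size_t maxIdx = popSize / numberOfDevices - 1;  // last idx passing the guard
    const std::size_t maxJ = stride - 1;                       // assumed upper end of j
    return maxIdx * stride
         + deviceNumber * popSize * stride / numberOfDevices
         + maxJ;
}

// True when a buffer holding `elements` entries covers every such access.
bool bufferCoversKernel(std::size_t elements, std::size_t popSize,
                        std::size_t stride, std::size_t numberOfDevices,
                        std::size_t deviceNumber) {
    return maxIndexTouched(popSize, stride, numberOfDevices, deviceNumber) < elements;
}
```

For example, bufferCoversKernel(600, 400, 3, 1, 0) is false: a buffer sized for 200 individuals (600 entries) would be read past its end if POP_SIZE were 400. Whether that is what happens here depends on how the buffers are allocated, which the post does not show.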

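Also, if I increase the number of threads gradually from 200 to 240, 280, and so on, I get no "nan" errors until I reach 440. At 440 threads, I get "nan" errors. When I then go back down to 400, I still get "nan" errors, and they continue until I am back at 200. At 200, it works fine again.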
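To narrow down where the "nan" values first appear, the results could be scanned on the host after they are read back. This is a minimal sketch, assuming the results have been copied into a std::vector<double>; the function name firstNaN is hypothetical and the element type in the actual program may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the index of the first NaN in `values`,
// or values.size() when no element is NaN.
std::size_t firstNaN(const std::vector<double>& values) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (std::isnan(values[i])) {
            return i;
        }
    }
    return values.size();
}
```

Comparing the returned index across runs with 200, 400, and 440 threads would show whether the bad values always start at the same individual or move with the thread count.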

PS: I have no such problem when I run the same program on a Mac Pro; I have never seen this error there.

Increasing the number of threads in OpenCL produces errors

0 answers: