Why does dask return nothing when used with a CUDA function?

Asked: 2019-05-20 18:56:29

Tags: python dask dask-distributed

I am trying to put dask on top of my CUDA function, but when dask returns, I get a NoneType object.

from numba import cuda
import numpy as np
from dask.distributed import Client, LocalCluster


@cuda.jit()
def addingNumbersCUDA (big_array, big_array2, save_array):
    i = cuda.grid(1)
    if i < big_array.shape[0]:
        for j in range (big_array.shape[1]):
            save_array[i][j] = big_array[i][j] * big_array2[i][j]


if __name__ == "__main__":
    cluster = LocalCluster()
    client = Client(cluster)


    big_array = np.random.random_sample((100, 3000))
    big_array2  = np.random.random_sample((100, 3000))
    save_array = np.zeros(shape=(100, 3000))


    arraysize = 100
    threadsperblock = 64
    # Round up so every row gets a thread (ceiling division).
    blockspergrid = (arraysize + (threadsperblock - 1)) // threadsperblock

    x = client.submit(addingNumbersCUDA[blockspergrid, threadsperblock], big_array, big_array2, save_array)


    y = client.gather(x)
    print(y)

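I understand that a CUDA kernel does not actually return anything and that the result is written back into the array you pass in. Is that why I get a NoneType, or is it because I am using dask incorrectly with CUDA?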

1 Answer:

Answer 0: (score: 0)

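As pointed out in this question: How to use Dask to run python code on the GPU?, by Matthew Rocklin, dask cannot handle in-place operations. The kernel writes into save_array and returns nothing, so the future you gather resolves to None. To work around this, it is best to add an additional function that handles the GPU code and returns the result.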

from numba import cuda
import numpy as np
from dask.distributed import Client, LocalCluster


@cuda.jit()
def addingNumbersCUDA (big_array, big_array2, save_array):
    i = cuda.grid(1)
    if i < big_array.shape[0]:
        for j in range (big_array.shape[1]):
            save_array[i][j] = big_array[i][j] * big_array2[i][j]

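def toCUDA (big_array, big_array2, save_array):
    # One thread per row; round the block count up so every row is covered.
    arraysize = big_array.shape[0]
    threadsperblock = 64
    blockspergrid = (arraysize + (threadsperblock - 1)) // threadsperblock

    # Copy the inputs and the output buffer to the GPU.
    d_big_array = cuda.to_device(big_array)
    d_big_array2 = cuda.to_device(big_array2)
    d_save_array = cuda.to_device(save_array)

    # Launch the kernel; it writes into d_save_array and returns nothing.
    addingNumbersCUDA[blockspergrid, threadsperblock](d_big_array, d_big_array2, d_save_array)

    # Copy the result back to the host and return it so dask has a value to hand back.
    save_array = d_save_array.copy_to_host()
    return save_array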

if __name__ == "__main__":
    cluster = LocalCluster()
    client = Client(cluster)

    big_array = np.random.random_sample((100, 3000))
    big_array2  = np.random.random_sample((100, 3000))
    save_array = np.zeros(shape=(100, 3000))

    x = client.submit(toCUDA, big_array, big_array2, save_array)


    y = client.gather(x)
    print(y)
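
For readers without a GPU, the difference between the two versions can be sketched in plain Python. This is only an illustration under stated assumptions: `run_on_fake_worker` is a hypothetical helper, not part of dask, and it uses a `pickle` round trip as a stand-in for the way arguments and results are copied when a task runs in a separate worker process. `multiply_in_place` plays the role of the kernel and `multiply_and_return` plays the role of `toCUDA`.

```python
import pickle


def multiply_in_place(a, b, out):
    """Mimics the kernel: writes into `out` and returns nothing."""
    for i in range(len(a)):
        out[i] = a[i] * b[i]


def multiply_and_return(a, b, out):
    """Mimics toCUDA: runs the in-place step, then returns the buffer."""
    multiply_in_place(a, b, out)
    return out


def run_on_fake_worker(func, *args):
    """Hypothetical stand-in for submit + gather.

    Assumption: the worker only ever sees copies of the arguments,
    and only the function's return value travels back.
    """
    worker_args = pickle.loads(pickle.dumps(args))  # copy sent to the "worker"
    result = func(*worker_args)                     # runs on the copy
    return pickle.loads(pickle.dumps(result))       # copy sent back


a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
out = [0.0, 0.0, 0.0]

print(run_on_fake_worker(multiply_in_place, a, b, out))    # None
print(out)                                                 # [0.0, 0.0, 0.0]
print(run_on_fake_worker(multiply_and_return, a, b, out))  # [4.0, 10.0, 18.0]
```

The in-place version changes only the worker's copy and hands back None, which matches what the question observed. The wrapper hands back the filled buffer, which is why the `toCUDA` approach above prints an array.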