Question

I managed to get function pointers working, and now I want to load such kernels dynamically. My code:

CUH:

#ifndef customkernel_cuh
#define customkernel_cuh

extern "C" pfunctionWhere __declspec(dllexport) getHostPointer();

#endif

CU:

__device__
    bool myWhere2(PapayaColumnValue *values)
{
    return ((int)values[1]) == 1 || ((int)values[1]) == 3;
}
__device__ pfunctionWhere pMyWhere2 = myWhere2;

pfunctionWhere __declspec(dllexport) getHostPointer()
{
    cudaError_t cudaStatus;
    pfunctionWhere h_pMyWhere2;
    cudaStatus = cudaMemcpyFromSymbol(&h_pMyWhere2, pMyWhere2, sizeof(pfunctionWhere));
    cudaDeviceSynchronize();
    return h_pMyWhere2;
}

In main.cpp:

HINSTANCE hGetProcIDDLL = LoadLibrary("xxx.dll");
    if (hGetProcIDDLL == NULL) {
        std::cout << "could not load the dynamic library" << std::endl;
    }
    dll_func dll_getHostPointer = (dll_func)GetProcAddress(hGetProcIDDLL, "getHostPointer");
    DWORD dw = GetLastError(); 
    if (!dll_getHostPointer) {
        std::cout << "could not locate the function" << std::endl;
    }
    pfunctionWhere h_pMyWhere2 = (*dll_getHostPointer)();

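If I step into the DLL with the debugger, cudaStatus is cudaSuccess, but the function pointer is null, and that null pointer is what the DLL call returns. My question is: is it possible to write kernel functions in a DLL, then obtain pointers to those kernels and pass them to the main program? I need to be able to change the kernel while the main program is running.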

Answer 1

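You can compile the kernel code to PTX and run it through the CUDA driver API; see CUDA C Programming Guide / Driver API / Module.

If you invoke nvcc with the -ptx option instead of --compile, it generates a PTX file. That file is not linked into your exe, so you can change the PTX file at any time.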
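As a rough illustration of this approach, here is a minimal host-side sketch that loads a PTX file at run time and launches a kernel from it. The file name customkernel.ptx, the kernel name whereKernel, and its int-based signature are assumptions made for the example (the PapayaColumnValue type from the question is not shown, so plain int stands in for it); only the driver API calls themselves are real.

```cuda
// Assumed kernel source, compiled separately with: nvcc -ptx customkernel.cu
//   extern "C" __global__ void whereKernel(const int *values, int *result, int n)
//   {
//       int i = blockIdx.x * blockDim.x + threadIdx.x;
//       if (i < n) result[i] = (values[i] == 1 || values[i] == 3);
//   }
// extern "C" keeps the name unmangled so it can be looked up by string.
#include <cuda.h>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Abort with a readable message if a driver API call fails.
static void check(CUresult r, const char *what) {
    if (r != CUDA_SUCCESS) {
        const char *msg = nullptr;
        cuGetErrorString(r, &msg);
        std::fprintf(stderr, "%s failed: %s\n", what, msg ? msg : "unknown error");
        std::exit(EXIT_FAILURE);
    }
}

int main() {
    // Initialize the driver API and make the primary context of device 0 current.
    check(cuInit(0), "cuInit");
    CUdevice dev;
    check(cuDeviceGet(&dev, 0), "cuDeviceGet");
    CUcontext ctx;
    check(cuDevicePrimaryCtxRetain(&ctx, dev), "cuDevicePrimaryCtxRetain");
    check(cuCtxSetCurrent(ctx), "cuCtxSetCurrent");

    // Load the PTX file (hypothetical name) and look up the kernel by name.
    CUmodule mod;
    check(cuModuleLoad(&mod, "customkernel.ptx"), "cuModuleLoad");
    CUfunction kernel;
    check(cuModuleGetFunction(&kernel, mod, "whereKernel"), "cuModuleGetFunction");

    // Prepare input and output buffers on the device.
    std::vector<int> values = {1, 2, 3, 4};
    int n = static_cast<int>(values.size());
    std::vector<int> result(n, 0);
    CUdeviceptr dValues, dResult;
    check(cuMemAlloc(&dValues, n * sizeof(int)), "cuMemAlloc(values)");
    check(cuMemAlloc(&dResult, n * sizeof(int)), "cuMemAlloc(result)");
    check(cuMemcpyHtoD(dValues, values.data(), n * sizeof(int)), "cuMemcpyHtoD");

    // Kernel arguments are passed as an array of pointers to each argument.
    void *args[] = { &dValues, &dResult, &n };
    check(cuLaunchKernel(kernel, 1, 1, 1, 32, 1, 1, 0, nullptr, args, nullptr),
          "cuLaunchKernel");
    check(cuCtxSynchronize(), "cuCtxSynchronize");
    check(cuMemcpyDtoH(result.data(), dResult, n * sizeof(int)), "cuMemcpyDtoH");

    for (int i = 0; i < n; ++i)
        std::printf("values[%d] = %d -> %d\n", i, values[i], result[i]);

    // Release resources; unloading the module is also the first step of a reload.
    check(cuMemFree(dValues), "cuMemFree(values)");
    check(cuMemFree(dResult), "cuMemFree(result)");
    check(cuModuleUnload(mod), "cuModuleUnload");
    check(cuDevicePrimaryCtxRelease(dev), "cuDevicePrimaryCtxRelease");
    return 0;
}
```

The host program has to be linked against the driver library (cuda.lib on Windows, -lcuda on Linux). To swap kernels while the program is running, one option is to call cuModuleUnload on the old module and cuModuleLoad on the regenerated PTX file, then look the function up again.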

Answer 2

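The whole code makes no sense.

First, you are not checking cudaStatus.

Second, you are copying from constant memory, but why? Surely you are not updating constant memory in the kernel. You are probably looking for cudaMemcpy rather than cudaMemcpyFromSymbol.

Google "pinned memory"; it may be useful to you.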
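For the first point, checking cudaStatus, a minimal sketch could look like the following. It is not the asker's actual code: the pfunctionWhere typedef over int* is an assumption (the real PapayaColumnValue type is not shown in the question), and getHostPointerChecked is a hypothetical name.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Assumed stand-in for the question's typedef; the real one takes PapayaColumnValue*.
typedef bool (*pfunctionWhere)(int *values);

__device__ bool myWhere2(int *values)
{
    return values[1] == 1 || values[1] == 3;
}

// Device-side variable holding the device address of myWhere2.
__device__ pfunctionWhere pMyWhere2 = myWhere2;

// Hypothetical checked variant: returns nullptr and logs the error if the copy fails.
pfunctionWhere getHostPointerChecked()
{
    pfunctionWhere h_pMyWhere2 = nullptr;
    cudaError_t cudaStatus =
        cudaMemcpyFromSymbol(&h_pMyWhere2, pMyWhere2, sizeof(pfunctionWhere));
    if (cudaStatus != cudaSuccess) {
        std::fprintf(stderr, "cudaMemcpyFromSymbol failed: %s\n",
                     cudaGetErrorString(cudaStatus));
        return nullptr;
    }
    return h_pMyWhere2;
}
```

Logging the cudaGetErrorString text makes it easier to tell a failed copy apart from a copy that succeeded but produced a null pointer, which is the situation the question describes.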

Compile a kernel into a DLL and use it
