Question

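I have recently run into trouble using multiple NVIDIA GPUs at runtime in a CUDA application. The attached code consistently reproduces the problem on my system with both Visual Studio 2013 and 2015 (Windows 7, CUDA 9.2, NVIDIA driver 398.26, 1x GTX 1080 and 1x GTX 960). I am building for the correct compute capabilities for my cards (5.2 and 6.1).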

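Specifically, after the first GPU has been initialized, I cannot get any function call to succeed on the second GPU. The error code is always "cudaErrorMemoryAllocation". It fails under the NVIDIA profiler as well as in both Debug and Release builds. I can initialize the GPUs in either order and still reproduce the problem.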

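The problem came up while trying to scale my current application, which is a large pipeline of image processing algorithms. There can be multiple independent instances of this pipeline, and because of memory constraints they will need multiple cards. The main reason I am so confused is that I have had this working before: I have a Visual Profiler session from a few years ago that shows the cards behaving the way I expect. The only difference I am aware of is that it was on CUDA 8.0.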

Any ideas?

#include "cuda_runtime.h"
#include "cuda.h"

#include <thread>
#include <conio.h>
#include <iostream>

// Function for each thread to run
void gpuThread(int gpuIdx, bool* result)
{
    cudaSetDevice(gpuIdx); // Set gpu index

    // Create an int array on CPU
    int* hostMemory = new int[1000000];
    for (int i = 0; i < 1000000; i++)
        hostMemory[i] = i;

    // Allocate and copy to GPU
    int* gpuMemory;
    cudaMalloc(&gpuMemory, 1000000 * sizeof(int));
    cudaMemcpy(gpuMemory, hostMemory, 1000000 * sizeof(int), cudaMemcpyHostToDevice);

    // Synchronize and check errors
    cudaDeviceSynchronize();
    cudaError_t error = cudaGetLastError();
    if (error != cudaSuccess) // cudaSuccess is the runtime API constant matching cudaError_t
    {
        result[0] = false;
        return;
    }

    result[0] =  true;
}

int main()
{
    bool result1 = false;
    bool result2 = false;

    std::thread t1(gpuThread, 0, &result1);
    std::thread t2(gpuThread, 1, &result2);

    t1.join();  // Wait for both threads to complete
    t2.join();

    if (!result1 || !result2) // Verify our threads returned success
        std::cout << "Failed\n";
    else
        std::cout << "Passed\n";

    std::cout << "Press a key to exit!\n";
    _getch();

    return 0;
}

Answer 1

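After a day of uninstalling and reinstalling software, it appears that this is an issue with the 398.26 driver. The newer 399.07 driver works as expected.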
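
Since the fix turned out to be a driver update, it can help to log which CUDA version the runtime and the driver report when a multi-GPU problem like this shows up. cudaRuntimeGetVersion and cudaDriverGetVersion both return an integer encoded as 1000 * major + 10 * minor (for example 9020 for CUDA 9.2). Note that this is the CUDA version the driver supports, not the display driver number such as 398.26, so it only narrows things down. The sketch below is a minimal decoder for that integer; the names CudaVersion, decodeCudaVersion and formatCudaVersion are hypothetical helpers, not part of the CUDA API, and the code is plain C++ so it compiles without the toolkit.

```cpp
#include <string>

// Hypothetical helper type (not part of the CUDA API): holds the two parts
// of the version integer reported by cudaRuntimeGetVersion or
// cudaDriverGetVersion.
struct CudaVersion {
    int majorVer;
    int minorVer;
};

// Decode the CUDA version encoding, which is 1000 * major + 10 * minor.
CudaVersion decodeCudaVersion(int encoded)
{
    return CudaVersion{ encoded / 1000, (encoded % 1000) / 10 };
}

// Format the encoded version as "major.minor" for logging.
std::string formatCudaVersion(int encoded)
{
    const CudaVersion v = decodeCudaVersion(encoded);
    return std::to_string(v.majorVer) + "." + std::to_string(v.minorVer);
}
```

In a CUDA program the input would come from something like int v = 0; cudaDriverGetVersion(&v); followed by formatCudaVersion(v). Logging this next to the result of each CUDA call makes it easier to tell a driver regression apart from a bug in the application code.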

Multiple GPUs in CUDA: code that previously worked no longer works
