Question

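I have a class (see the example below) that acts as a .NET wrapper for a CUDA memory structure. The memory is allocated with cudaMalloc() and referenced through a member field of type IntPtr. (The class uses DllImport on a native C DLL that wraps various CUDA functions.)

The Dispose method checks whether the pointer is IntPtr.Zero and, if it is not, calls cudaFree(), which successfully deallocates the memory (returns CUDA success), and then sets the pointer to IntPtr.Zero. The finalizer calls the Dispose method.

The problem is that if the finalizer runs without Dispose having been called beforehand, cudaFree() sets an error code of "invalid device pointer".

I checked: the address cudaFree() receives is the same address returned by cudaMalloc(), and Dispose() has not been called before. When I add an explicit call to Dispose(), the same address is freed successfully.

The only workaround I have found is to not call Dispose from the finalizer. However, this could cause a memory leak if Dispose() is not always called.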

Any ideas why this happens? I encountered the same issue with CUDA 2.2 and 2.3, under .NET 3.5 SP1, on Windows Vista 64-bit with a GeForce 8800 and on Windows XP 32-bit with a Quadro FX (not sure which model).

class CudaEntity : IDisposable
{
    private IntPtr dataPointer;

    public CudaEntity()
    {
        // Calls cudaMalloc() via DllImport,
        // receives the error code and throws an exception if it is not 0,
        // then assigns the returned address to this.dataPointer
    }

    public void Dispose()
    {
        if (this.dataPointer != IntPtr.Zero)
        {
            // Calls cudaFree() via DllImport,
            // receives the error code and throws an exception if it is not 0

            this.dataPointer = IntPtr.Zero;
        }
    }

    ~CudaEntity()
    {
        Dispose();
    }
}

{
    // This code works
    var myEntity = new CudaEntity();
    myEntity.Dispose();
}

{
    // This code causes an "invalid device pointer"
    // error on the finalizer's call to cudaFree()
    var myEntity = new CudaEntity();
}

Answer 1

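The problem is that finalizers are executed on the GC thread, and CUDA resources allocated in one thread cannot be used in another. An excerpt from the CUDA Programming Guide: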

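"Several host threads can execute device code on the same device, but by design, a host thread can execute device code on only one device. As a consequence, multiple host threads are required to execute device code on multiple devices. Also, any CUDA resources created through the runtime in one host thread cannot be used by the runtime from another host thread."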
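To make that restriction visible on the .NET side, a wrapper could remember which managed thread created the pointer and check it before releasing. The following is only a minimal sketch: ThreadOwnedPointer and ThreadOwnershipDemo are hypothetical names, and the IntPtr value stands in for an address returned by cudaMalloc().

```csharp
using System;
using System.Threading;

// Hypothetical helper: remembers which managed thread created the pointer.
sealed class ThreadOwnedPointer
{
    // Captured on the thread that runs the constructor.
    private readonly int ownerThreadId = Thread.CurrentThread.ManagedThreadId;

    public readonly IntPtr Value;

    public ThreadOwnedPointer(IntPtr value)
    {
        Value = value;
    }

    // True only when called from the thread that created this instance.
    public bool IsOwnedByCurrentThread
    {
        get { return Thread.CurrentThread.ManagedThreadId == ownerThreadId; }
    }
}

static class ThreadOwnershipDemo
{
    // Runs the ownership check on a separate thread, much like a finalizer
    // would run on a thread other than the one that allocated the memory.
    public static bool CheckFromOtherThread(ThreadOwnedPointer pointer)
    {
        bool owned = true;
        var worker = new Thread(() => { owned = pointer.IsOwnedByCurrentThread; });
        worker.Start();
        worker.Join();
        return owned;
    }
}
```

A check like this does not fix the problem, but it lets the wrapper fail with a clear message instead of a confusing "invalid device pointer" error.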

Your best bet is to use the using statement, which ensures that the Dispose() method is always called at the end of the "protected" code block:

using (CudaEntity ent = new CudaEntity())
{
    // Work with ent here; Dispose() runs on this thread when the block exits
}
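If a finalizer is kept as a safety net, one option is the usual Dispose(bool) pattern, with the native free performed only on the explicit Dispose path. This is a sketch under assumptions, not the asker's actual code: SafeCudaEntity is a hypothetical name, and the allocate and free delegates are stand-ins for the DllImport'ed cudaMalloc() and cudaFree() wrappers, which the question does not show.

```csharp
using System;

// Sketch only: 'allocate' and 'free' are hypothetical stand-ins for the
// DllImport'ed cudaMalloc()/cudaFree() wrappers.
sealed class SafeCudaEntity : IDisposable
{
    private readonly Func<IntPtr, int> free; // assumed to return 0 on success
    private IntPtr dataPointer;

    public SafeCudaEntity(Func<IntPtr> allocate, Func<IntPtr, int> free)
    {
        this.free = free;
        this.dataPointer = allocate();
    }

    public bool IsDisposed
    {
        get { return dataPointer == IntPtr.Zero; }
    }

    public void Dispose()
    {
        Dispose(true);
        // Nothing is left for the finalizer to do after an explicit Dispose.
        GC.SuppressFinalize(this);
    }

    private void Dispose(bool disposing)
    {
        if (dataPointer == IntPtr.Zero)
        {
            return; // already released
        }

        if (!disposing)
        {
            // Finalizer thread: the pointer belongs to another host thread,
            // so the native free is skipped rather than allowed to fail.
            return;
        }

        int errorCode = free(dataPointer);
        if (errorCode != 0)
        {
            throw new InvalidOperationException("cudaFree failed with error code " + errorCode);
        }
        dataPointer = IntPtr.Zero;
    }

    ~SafeCudaEntity()
    {
        Dispose(false);
    }
}
```

With this shape, a forgotten Dispose() still leaks the device allocation, which is the same trade-off as the workaround in the question, but the finalizer no longer throws from the GC thread.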

CUDA global memory deallocation issue in .NET
