I have been using TensorFlow 2.3.0 with CUDA 10.1 and cuDNN 7.6.5 on Windows 10 for a while.
Driver API (nvidia-smi):
Thu Jan 7 15:50:14 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 461.09 Driver Version: 461.09 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 106... WDDM | 00000000:01:00.0 Off | N/A |
| N/A 57C P8 8W / N/A | 92MiB / 6144MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Runtime API (nvcc -V):
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:12:52_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.1, V10.1.243
GPU: NVIDIA GeForce GTX 1060 with Max-Q Design
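Until now I have been able to train TensorFlow models and run inference without problems. A few days ago I started getting "CUDA_ERROR_OUT_OF_MEMORY: out of memory" when running inference on a model that I could previously run inference on. The inference code has not changed either. Could some other process be filling up the CUDA memory? I have already tried removing CUDA and cuDNN and reinstalling them.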
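To check whether something else is holding GPU memory before TensorFlow starts, a rough sketch like the one below could help. It reads used and total memory through nvidia-smi's CSV query mode and, as a separate step, asks TensorFlow to allocate GPU memory on demand instead of reserving most of it up front. The function names (parse_gpu_memory, query_gpu_memory, enable_memory_growth) are made up for illustration. Whether memory growth actually avoids the error here is an assumption, not something the logs confirm.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows (in MiB) from nvidia-smi's CSV query output."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for used/total memory per GPU (requires nvidia-smi on PATH)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


def enable_memory_growth():
    """Let TensorFlow grow its GPU allocation on demand.

    Must be called before any GPU op runs, otherwise TensorFlow raises
    a RuntimeError. TensorFlow is imported lazily so the helpers above
    can be used without it.
    """
    import tensorflow as tf

    for gpu in tf.config.experimental.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)


# Example use (not run here):
#   for used, total in query_gpu_memory():
#       print(f"{used} MiB used of {total} MiB")
#   enable_memory_growth()
```

If query_gpu_memory() shows most of the 6144 MiB already in use before the inference script starts, another process is the likely culprit. If it shows only a small amount in use, as in the nvidia-smi output above, the allocation made by TensorFlow itself is the more likely place to look.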
Here is the error log from running inference:
I also ran cuda-memcheck to check for any leaks.
Here are the logs from cuda-memcheck --leak-check full:
Any help is greatly appreciated!