Hi, I know this is a common problem, and most solutions suggest allowing GPU memory growth with the following code:
import tensorflow as tf  # TensorFlow 1.x API

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of all up front
sess = tf.Session(config=config)
I added this code to my train.py file, and my batch size is only 1, but I still get the same error. I ran nvidia-smi, and this is the output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 442.19       Driver Version: 442.19       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1060   WDDM  | 00000000:01:00.0 Off |                  N/A |
| N/A   82C    P2    65W /  N/A |   5069MiB /  6144MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       796      C   ...cal\Programs\Python\Python37\python.exe N/A      |
Any help with what keeps causing the OOM error over and over would be appreciated. Training still runs, but it is very slow.
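To put a number on the headroom in the nvidia-smi output above, here is a minimal sketch that reads the Memory-Usage field and computes what is left. The helper name `parse_memory_usage` is made up for illustration, and the sketch assumes the field always has the `<used>MiB / <total>MiB` shape shown in the table.

```python
import re


def parse_memory_usage(field):
    """Parse a string like '5069MiB / 6144MiB' into (used, total) in MiB."""
    match = re.search(r"(\d+)\s*MiB\s*/\s*(\d+)\s*MiB", field)
    if match is None:
        raise ValueError(f"unrecognized memory field: {field!r}")
    return int(match.group(1)), int(match.group(2))


# Values copied from the Memory-Usage column of the output above.
used_mib, total_mib = parse_memory_usage("5069MiB / 6144MiB")
free_mib = total_mib - used_mib
print(f"used {used_mib} MiB of {total_mib} MiB, {free_mib} MiB free")
```

With the figures from my output this gives 1075 MiB free out of 6144 MiB.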