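I installed tensorflow-gpu in Anaconda (5.3.0, Python 3.6). When I run my script, the model starts and I can see all GPUs become active, but after about a second they all drop to zero and the kernel hangs.
I also get the following errors: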
2019-01-17 22:15:59.437729: E tensorflow/stream_executor/cuda/cuda_dnn.cc:353] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2019-01-17 22:15:59.437826: E tensorflow/stream_executor/cuda/cuda_dnn.cc:361] Possibly insufficient driver version: 390.87.0
[I 22:16:02.376 LabApp] KernelRestarter: restarting kernel (1/5), keep random ports
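To make sense of the "Possibly insufficient driver version" hint, the reported driver can be compared with the minimum driver each CUDA toolkit needs. This is only a minimal sketch: the `MIN_DRIVER` table reflects the Linux minimums as I understand them from NVIDIA's CUDA release notes and should be verified there, and the helper names `parse_driver_version` and `driver_is_sufficient` are my own, not part of TensorFlow.

```python
import re

# Minimum Linux driver per CUDA toolkit. These values are an assumption
# taken from my reading of NVIDIA's CUDA release notes; verify them there.
MIN_DRIVER = {
    "9.0": (384, 81),
    "9.2": (396, 26),
    "10.0": (410, 48),
}


def parse_driver_version(log_line):
    """Extract (major, minor) from a '... driver version: 390.87.0' log line."""
    match = re.search(r"driver version:\s*(\d+)\.(\d+)", log_line)
    if match is None:
        raise ValueError("no driver version found in: %r" % log_line)
    return int(match.group(1)), int(match.group(2))


def driver_is_sufficient(driver, cuda_version):
    """Compare a (major, minor) driver tuple with the assumed minimum."""
    return driver >= MIN_DRIVER[cuda_version]


line = ("2019-01-17 22:15:59.437826: E tensorflow/stream_executor/cuda/"
        "cuda_dnn.cc:361] Possibly insufficient driver version: 390.87.0")
driver = parse_driver_version(line)

# Report, for each toolkit in the table, whether this driver would be enough.
for cuda_version in ("9.0", "9.2", "10.0"):
    print(cuda_version, driver_is_sufficient(driver, cuda_version))
```

If the table is right, driver 390.87 would be new enough for CUDA 9.0 but not for 9.2 or 10.0, so which toolkit the tensorflow-gpu build was compiled against matters here.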
My system information:
Ubuntu 18.04.1 LTS
Anaconda (5.3.0) with a virtual environment, Python 3.6.6, and TensorFlow installed with: conda install -c anaconda tensorflow-gpu
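Since the install went through conda, the CUDA toolkit and cuDNN that TensorFlow loads may be the copies conda pulled into the environment rather than any system-wide packages. Here is a rough sketch for listing them, assuming the usual conda layout in which each installed package has a JSON record with `name` and `version` fields under the environment's `conda-meta` directory. The function name `conda_package_versions` and the default package names are my own choices for illustration.

```python
import json
import os
import sys


def conda_package_versions(conda_meta_dir,
                           names=("cudatoolkit", "cudnn", "tensorflow-gpu")):
    """Return {package name: version} for selected packages in a conda env."""
    found = {}
    if not os.path.isdir(conda_meta_dir):
        return found
    for filename in sorted(os.listdir(conda_meta_dir)):
        # Package records are JSON files; skip anything else (e.g. 'history').
        if not filename.endswith(".json"):
            continue
        with open(os.path.join(conda_meta_dir, filename)) as handle:
            record = json.load(handle)
        if isinstance(record, dict) and record.get("name") in names:
            found[record["name"]] = record.get("version")
    return found


if __name__ == "__main__":
    # When run inside the activated environment, sys.prefix is its root.
    print(conda_package_versions(os.path.join(sys.prefix, "conda-meta")))
```

Running it with the environment's own Python should show which cudatoolkit and cudnn versions, if any, conda installed alongside tensorflow-gpu.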
My cuDNN packages and GPUs:
$ dpkg -l | grep cudnn
ii libcudnn7 7.1.4.18-1+cuda9.0 amd64 cuDNN runtime libraries
ii libcudnn7-dev 7.1.4.18-1+cuda9.0 amd64 cuDNN development libraries and headers
ii libcudnn7-doc 7.1.4.18-1+cuda9.0 amd64 cuDNN documents and samples
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.87 Driver Version: 390.87 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... Off | 00000000:04:00.0 Off | N/A |
| 29% 24C P8 14W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 108... Off | 00000000:06:00.0 Off | N/A |
| 29% 25C P8 15W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX 108... Off | 00000000:07:00.0 Off | N/A |
| 25% 22C P8 16W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX 108... Off | 00000000:08:00.0 Off | N/A |
| 29% 26C P8 14W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 GeForce GTX 108... Off | 00000000:0B:00.0 Off | N/A |
| 23% 31C P8 15W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 GeForce GTX 108... Off | 00000000:0C:00.0 Off | N/A |
| 23% 27C P8 15W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 GeForce GTX 108... Off | 00000000:0D:00.0 Off | N/A |
| 23% 24C P8 15W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 7 GeForce GTX 108... Off | 00000000:0E:00.0 Off | N/A |
| 23% 28C P8 16W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 8 GeForce GTX 108... Off | 00000000:0F:00.0 Off | N/A |
| 23% 24C P8 15W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 9 GeForce GTX 108... Off | 00000000:82:00.0 Off | N/A |
| 23% 25C P8 16W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
And the command-line output (I am using JupyterLab):
2019-01-17 22:15:55.042286: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
2019-01-17 22:15:57.568878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-17 22:15:57.568932: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] 0 1 2 3 4 5 6 7 8 9
2019-01-17 22:15:57.568938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0: N Y Y Y Y Y Y Y Y N
2019-01-17 22:15:57.568943: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 1: Y N Y Y Y Y Y Y Y N
2019-01-17 22:15:57.568946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 2: Y Y N Y Y Y Y Y Y N
2019-01-17 22:15:57.568950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 3: Y Y Y N Y Y Y Y Y N
2019-01-17 22:15:57.568954: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 4: Y Y Y Y N Y Y Y Y N
2019-01-17 22:15:57.568958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 5: Y Y Y Y Y N Y Y Y N
2019-01-17 22:15:57.568963: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 6: Y Y Y Y Y Y N Y Y N
2019-01-17 22:15:57.568982: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 7: Y Y Y Y Y Y Y N Y N
2019-01-17 22:15:57.568986: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 8: Y Y Y Y Y Y Y Y N N
2019-01-17 22:15:57.568995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 9: N N N N N N N N N N
2019-01-17 22:15:57.571075: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10401 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)
2019-01-17 22:15:57.715851: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10401 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0, compute capability: 6.1)
2019-01-17 22:15:57.859939: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10401 MB memory) -> physical GPU (device: 2, name: GeForce GTX 1080 Ti, pci bus id: 0000:07:00.0, compute capability: 6.1)
2019-01-17 22:15:58.003691: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 10401 MB memory) -> physical GPU (device: 3, name: GeForce GTX 1080 Ti, pci bus id: 0000:08:00.0, compute capability: 6.1)
2019-01-17 22:15:58.147301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:4 with 10401 MB memory) -> physical GPU (device: 4, name: GeForce GTX 1080 Ti, pci bus id: 0000:0b:00.0, compute capability: 6.1)
2019-01-17 22:15:58.290928: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:5 with 10401 MB memory) -> physical GPU (device: 5, name: GeForce GTX 1080 Ti, pci bus id: 0000:0c:00.0, compute capability: 6.1)
2019-01-17 22:15:58.434501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:6 with 10401 MB memory) -> physical GPU (device: 6, name: GeForce GTX 1080 Ti, pci bus id: 0000:0d:00.0, compute capability: 6.1)
2019-01-17 22:15:58.578104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:7 with 10401 MB memory) -> physical GPU (device: 7, name: GeForce GTX 1080 Ti, pci bus id: 0000:0e:00.0, compute capability: 6.1)
2019-01-17 22:15:58.721778: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:8 with 10401 MB memory) -> physical GPU (device: 8, name: GeForce GTX 1080 Ti, pci bus id: 0000:0f:00.0, compute capability: 6.1)
2019-01-17 22:15:58.865741: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:9 with 10401 MB memory) -> physical GPU (device: 9, name: GeForce GTX 1080 Ti, pci bus id: 0000:82:00.0, compute capability: 6.1)
2019-01-17 22:15:59.437729: E tensorflow/stream_executor/cuda/cuda_dnn.cc:353] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2019-01-17 22:15:59.437826: E tensorflow/stream_executor/cuda/cuda_dnn.cc:361] Possibly insufficient driver version: 390.87.0
I have been searching the internet for this problem, but there are many confusing and conflicting solutions. Please help!