Keras (Anaconda) GPU computation fails - Possibly insufficient driver version: 390.87.0

Time: 2019-01-17 22:40:27

Tags: python tensorflow keras gpu

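I have installed tensorflow-gpu in Anaconda (5.3.0, Python 3.6). When I run my script, the model starts and I can see all GPUs become active, but after about a second they all drop to zero and the kernel hangs.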

I also get the following errors:

2019-01-17 22:15:59.437729: E tensorflow/stream_executor/cuda/cuda_dnn.cc:353] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2019-01-17 22:15:59.437826: E tensorflow/stream_executor/cuda/cuda_dnn.cc:361] Possibly insufficient driver version: 390.87.0
[I 22:16:02.376 LabApp] KernelRestarter: restarting kernel (1/5), keep random ports

My system information:

Ubuntu 18.04.1 LTS with Anaconda (5.3.0) and a virtual environment. Python version 3.6.6, with tensorflow installed via: conda install -c anaconda tensorflow-gpu

My GPUs:

$ dpkg -l | grep cudnn
ii  libcudnn7                                  7.1.4.18-1+cuda9.0                           amd64        cuDNN runtime libraries
ii  libcudnn7-dev                              7.1.4.18-1+cuda9.0                           amd64        cuDNN development libraries and headers
ii  libcudnn7-doc                              7.1.4.18-1+cuda9.0                           amd64        cuDNN documents and samples


+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.87                 Driver Version: 390.87                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:04:00.0 Off |                  N/A |
| 29%   24C    P8    14W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:06:00.0 Off |                  N/A |
| 29%   25C    P8    15W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:07:00.0 Off |                  N/A |
| 25%   22C    P8    16W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  Off  | 00000000:08:00.0 Off |                  N/A |
| 29%   26C    P8    14W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GeForce GTX 108...  Off  | 00000000:0B:00.0 Off |                  N/A |
| 23%   31C    P8    15W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GeForce GTX 108...  Off  | 00000000:0C:00.0 Off |                  N/A |
| 23%   27C    P8    15W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GeForce GTX 108...  Off  | 00000000:0D:00.0 Off |                  N/A |
| 23%   24C    P8    15W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   7  GeForce GTX 108...  Off  | 00000000:0E:00.0 Off |                  N/A |
| 23%   28C    P8    16W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   8  GeForce GTX 108...  Off  | 00000000:0F:00.0 Off |                  N/A |
| 23%   24C    P8    15W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   9  GeForce GTX 108...  Off  | 00000000:82:00.0 Off |                  N/A |
| 23%   25C    P8    16W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
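
To make the version information in the listings above easier to compare, here is a minimal sketch that pulls the cuDNN version, the CUDA version that cuDNN package was built for, and the driver version out of the `dpkg` and `nvidia-smi` text. The function names are hypothetical helpers, not part of any library, and the regular expressions assume the exact formats shown above. Note that these are the system-wide packages; the conda environment may carry its own cudatoolkit and cudnn packages, which this listing does not show.

```python
import re


def parse_cudnn_package_version(version):
    """Split a Debian cuDNN package version such as '7.1.4.18-1+cuda9.0'.

    Returns (cudnn_version, cuda_version) as strings.
    Hypothetical helper; assumes the '<cudnn>-<rev>+cuda<major.minor>' layout.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)*)-\d+\+cuda(\d+\.\d+)", version)
    if match is None:
        raise ValueError("unrecognized cuDNN package version: %r" % version)
    return match.group(1), match.group(2)


def parse_smi_driver_version(header_line):
    """Extract the driver version from the nvidia-smi header line, or None."""
    match = re.search(r"Driver Version:\s*(\d+(?:\.\d+)*)", header_line)
    return match.group(1) if match else None


if __name__ == "__main__":
    cudnn, cuda = parse_cudnn_package_version("7.1.4.18-1+cuda9.0")
    print("system cuDNN:", cudnn, "built for CUDA", cuda)

    header = "| NVIDIA-SMI 390.87                 Driver Version: 390.87                    |"
    print("driver:", parse_smi_driver_version(header))
```

Running `conda list` inside the environment and looking for the cudatoolkit and cudnn entries would show which versions TensorFlow actually loads there.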

And the command-line output (I am using JupyterLab):

2019-01-17 22:15:55.042286: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
2019-01-17 22:15:57.568878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-17 22:15:57.568932: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]      0 1 2 3 4 5 6 7 8 9
2019-01-17 22:15:57.568938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0:   N Y Y Y Y Y Y Y Y N
2019-01-17 22:15:57.568943: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 1:   Y N Y Y Y Y Y Y Y N
2019-01-17 22:15:57.568946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 2:   Y Y N Y Y Y Y Y Y N
2019-01-17 22:15:57.568950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 3:   Y Y Y N Y Y Y Y Y N
2019-01-17 22:15:57.568954: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 4:   Y Y Y Y N Y Y Y Y N
2019-01-17 22:15:57.568958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 5:   Y Y Y Y Y N Y Y Y N
2019-01-17 22:15:57.568963: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 6:   Y Y Y Y Y Y N Y Y N
2019-01-17 22:15:57.568982: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 7:   Y Y Y Y Y Y Y N Y N
2019-01-17 22:15:57.568986: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 8:   Y Y Y Y Y Y Y Y N N
2019-01-17 22:15:57.568995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 9:   N N N N N N N N N N
2019-01-17 22:15:57.571075: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10401 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)
2019-01-17 22:15:57.715851: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10401 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0, compute capability: 6.1)
2019-01-17 22:15:57.859939: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10401 MB memory) -> physical GPU (device: 2, name: GeForce GTX 1080 Ti, pci bus id: 0000:07:00.0, compute capability: 6.1)
2019-01-17 22:15:58.003691: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 10401 MB memory) -> physical GPU (device: 3, name: GeForce GTX 1080 Ti, pci bus id: 0000:08:00.0, compute capability: 6.1)
2019-01-17 22:15:58.147301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:4 with 10401 MB memory) -> physical GPU (device: 4, name: GeForce GTX 1080 Ti, pci bus id: 0000:0b:00.0, compute capability: 6.1)
2019-01-17 22:15:58.290928: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:5 with 10401 MB memory) -> physical GPU (device: 5, name: GeForce GTX 1080 Ti, pci bus id: 0000:0c:00.0, compute capability: 6.1)
2019-01-17 22:15:58.434501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:6 with 10401 MB memory) -> physical GPU (device: 6, name: GeForce GTX 1080 Ti, pci bus id: 0000:0d:00.0, compute capability: 6.1)
2019-01-17 22:15:58.578104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:7 with 10401 MB memory) -> physical GPU (device: 7, name: GeForce GTX 1080 Ti, pci bus id: 0000:0e:00.0, compute capability: 6.1)
2019-01-17 22:15:58.721778: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:8 with 10401 MB memory) -> physical GPU (device: 8, name: GeForce GTX 1080 Ti, pci bus id: 0000:0f:00.0, compute capability: 6.1)
2019-01-17 22:15:58.865741: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:9 with 10401 MB memory) -> physical GPU (device: 9, name: GeForce GTX 1080 Ti, pci bus id: 0000:82:00.0, compute capability: 6.1)
2019-01-17 22:15:59.437729: E tensorflow/stream_executor/cuda/cuda_dnn.cc:353] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2019-01-17 22:15:59.437826: E tensorflow/stream_executor/cuda/cuda_dnn.cc:361] Possibly insufficient driver version: 390.87.0
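
As a rough aid for checking the "Possibly insufficient driver version" line, the sketch below extracts the reported driver version from the log text and compares it with a minimum supplied by the caller. The helper names are hypothetical, and the minimum used in the example is only a placeholder: the real value depends on the CUDA toolkit version in the conda environment and would have to be looked up in NVIDIA's CUDA release notes.

```python
import re

# Matches the TensorFlow log line shown above and captures the version number.
DRIVER_RE = re.compile(r"Possibly insufficient driver version:\s*(\d+(?:\.\d+)*)")


def reported_driver_version(log_text):
    """Return the driver version from the log as a tuple of ints, or None."""
    match = DRIVER_RE.search(log_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def driver_is_sufficient(installed, required):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(installed), len(required))
    pad = lambda version: tuple(version) + (0,) * (width - len(version))
    return pad(installed) >= pad(required)


if __name__ == "__main__":
    log = (
        "2019-01-17 22:15:59.437826: E tensorflow/stream_executor/cuda/"
        "cuda_dnn.cc:361] Possibly insufficient driver version: 390.87.0"
    )
    installed = reported_driver_version(log)

    # Placeholder only: replace with the minimum driver version that NVIDIA
    # documents for the CUDA toolkit installed in the conda environment.
    hypothetical_minimum = (400, 0)

    print("installed driver:", installed)
    print("meets placeholder minimum:", driver_is_sufficient(installed, hypothetical_minimum))
```

Comparing versions as integer tuples avoids the string-comparison pitfall where "390.9" would sort after "390.87".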

I have been searching the internet for this problem, and the many proposed solutions are confusing. Please help!

0 answers:

No answers