failed to allocate 158.06M (165740544 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

Time: 2017-11-20 22:52:52

Tags: centos keras nvidia

How can I resolve this error?

[jalal@goku bin]$ source activate deep_emotion
(deep_emotion) [jalal@goku bin]$ python
Python 3.5.4 | packaged by conda-forge | (default, Nov  4 2017, 10:11:29)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import keras
Using Theano backend.
>>> quit()
(deep_emotion) [jalal@goku bin]$ export KERAS_BACKEND=tensorflow
(deep_emotion) [jalal@goku bin]$ python
Python 3.5.4 | packaged by conda-forge | (default, Nov  4 2017, 10:11:29)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import keras
Using TensorFlow backend.
2017-11-20 17:49:18.666294: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-20 17:49:18.666337: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-20 17:49:18.666347: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-11-20 17:49:18.666354: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-20 17:49:18.666363: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-11-20 17:49:19.196610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties:
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.6705
pciBusID 0000:05:00.0
Total memory: 10.91GiB
Free memory: 158.06MiB
2017-11-20 17:49:19.426132: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x42e9db0 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-11-20 17:49:19.426768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 1 with properties:
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.6705
pciBusID 0000:06:00.0
Total memory: 10.91GiB
Free memory: 398.44MiB
2017-11-20 17:49:19.427277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 1
2017-11-20 17:49:19.427309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y Y
2017-11-20 17:49:19.427323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 1:   Y Y
2017-11-20 17:49:19.427347: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:05:00.0)
2017-11-20 17:49:19.427362: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0)
2017-11-20 17:49:19.429776: E tensorflow/stream_executor/cuda/cuda_driver.cc:924] failed to allocate 158.06M (165740544 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
>>> quit()
(deep_emotion) [jalal@goku bin]$ conda list | grep keras
keras                     2.0.9                    py35_0    conda-forge
(deep_emotion) [jalal@goku bin]$ conda list | grep tensorflow
tensorflow-gpu            1.3.0                         0
tensorflow-gpu-base       1.3.0           py35cuda8.0cudnn6.0_1
tensorflow-tensorboard    0.1.5                    py35_0

The system information is as follows:

$ uname -a
Linux goku.bu.edu 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

(deep_emotion) [jalal@goku bin]$ nvidia-smi
Mon Nov 20 17:51:50 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.81                 Driver Version: 384.81                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:05:00.0  On |                  N/A |
|  0%   25C    P8    19W / 250W |  10862MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:06:00.0 Off |                  N/A |
|  0%   36C    P8    19W / 250W |  10622MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2062      G   /usr/bin/X                                   183MiB |
|    0      2779      G   /usr/bin/gnome-shell                         176MiB |
|    0      3298      C   /cs/software/anaconda3/bin/python          10341MiB |
|    0      4350      G   ...-token=2BC290A510039A38C05EF3ECBAA5E5E5    78MiB |
|    0      5212      G   /usr/lib64/firefox/plugin-container            5MiB |
|    0     32257      G   /proc/self/exe                                64MiB |
|    1      3298      C   /cs/software/anaconda3/bin/python          10611MiB |
+-----------------------------------------------------------------------------+

1 answer:

Answer 0 (score: 1)

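Thanks to Robert Crovella for the suggestion. Rebooting the machine solved the problem. The nvidia-smi output in the question shows PID 3298 (/cs/software/anaconda3/bin/python) holding roughly 10 GiB on each GPU, which left TensorFlow only 158.06 MiB free on device 0; the reboot presumably ended that process and released the memory.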
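To see which process is holding the GPU memory before resorting to a reboot, the process table printed by nvidia-smi can be summed per PID. The following is only a minimal sketch: the `memory_by_pid` helper and the `ROW` pattern are hypothetical names, the pattern assumes the table layout of the 384.81 driver shown in the question (other driver versions print different columns), it ignores process names that contain spaces, and `SAMPLE` is a subset of rows copied from the question rather than live output.

```python
import re
from collections import defaultdict

# Matches one process row of the nvidia-smi table in the layout shown in the
# question (driver 384.81), for example:
# |    0      3298      C   /cs/software/anaconda3/bin/python          10341MiB |
# Assumption: the process name contains no spaces.
ROW = re.compile(
    r"^\|\s+(\d+)\s+(\d+)\s+([CG](?:\+[CG])?)\s+(\S+)\s+(\d+)MiB\s+\|$"
)


def memory_by_pid(table_text):
    """Sum GPU memory (MiB) over all GPUs for each (pid, process name)."""
    usage = defaultdict(int)
    for line in table_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            _gpu, pid, _ptype, name, mib = match.groups()
            usage[(int(pid), name)] += int(mib)
    return dict(usage)


# Rows copied from the nvidia-smi output in the question.
SAMPLE = """\
|    0      2062      G   /usr/bin/X                                   183MiB |
|    0      2779      G   /usr/bin/gnome-shell                         176MiB |
|    0      3298      C   /cs/software/anaconda3/bin/python          10341MiB |
|    1      3298      C   /cs/software/anaconda3/bin/python          10611MiB |
"""

usage = memory_by_pid(SAMPLE)
# Pick the entry with the largest total across both GPUs.
(top_pid, top_name), top_mib = max(usage.items(), key=lambda kv: kv[1])
print(top_pid, top_name, top_mib)
```

On these rows the largest consumer is PID 3298, the Python process that occupies both cards. If such a process turns out to be a stale job of your own, ending it should release the memory without a reboot.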
