Keras with the TensorFlow backend is not using the GPU

Asked: 2017-09-06 16:54:52

Tags: docker tensorflow keras tensorflow-gpu

I built the GPU version of the Docker image https://github.com/floydhub/dl-docker with Keras 2.0.0 and TensorFlow 0.12.1. I then ran the MNIST tutorial https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py, but realized that Keras is not using the GPU. Here is my output:

root@b79b8a57fb1f:~/sharedfolder# python test.py
Using TensorFlow backend.
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2017-09-06 16:26:54.866833: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866855: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866863: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866870: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866876: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.

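Can someone tell me whether any setup is needed before Keras will use the GPU? I am very new to all of this, so please let me know if I need to provide more information.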

I have installed the prerequisites mentioned on the page.

I am able to launch the Docker image:

docker run -it -p 8888:8888 -p 6006:6006 -v /sharedfolder:/root/sharedfolder floydhub/dl-docker:cpu bash
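  • GPU version only: install the Nvidia drivers on your machine, either directly from Nvidia or by following the instructions here. Note that you do not have to install CUDA or cuDNN; these are included in the Docker container.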

I am able to run the last step:

cv@cv-P15SM:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.66  Mon May  1 15:29:16 PDT 2017
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
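  • GPU version only: install nvidia-docker by following the instructions at https://github.com/NVIDIA/nvidia-docker. This installs a replacement for the docker CLI. It takes care of setting up the Nvidia host driver environment inside the Docker containers, along with a few other things.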

I am able to run the step described here:

# Test nvidia-smi
cv@cv-P15SM:~$ nvidia-docker run --rm nvidia/cuda nvidia-smi

Thu Sep  7 00:33:06 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 780M    Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   55C    P0    N/A /  N/A |    310MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+

I am also able to run the nvidia-docker command to launch a GPU-enabled image.

What I have tried

I have tried the following suggestions:

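  1. Check whether you have completed step 9 of this tutorial (https://github.com/ignaciorlando/skinner/wiki/Keras-and-TensorFlow-installation). Note: your file paths may be completely different inside the Docker image, and you will have to locate them somehow.
  2. I added the suggested lines to my bashrc and verified that the bashrc file was updated.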

    echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64' >> ~/.bashrc
    echo 'export CUDA_HOME=/usr/local/cuda-8.0' >> ~/.bashrc
    
    1. Add the following lines to my Python file:

      import os
      os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # see issue #152
      os.environ["CUDA_VISIBLE_DEVICES"] = "0"

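    2. Unfortunately, neither of these steps, done separately or together, solved the issue. Keras is still running with the CPU version of TensorFlow as its backend. However, I may have found the likely cause. I checked the versions of TensorFlow on my system with the following commands and found two of them.

      This is the CPU version: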

      root@08b5fff06800:~# pip show tensorflow
      Name: tensorflow
      Version: 1.3.0
      Summary: TensorFlow helps the tensors flow
      Home-page: http://tensorflow.org/
      Author: Google Inc.
      Author-email: opensource@google.com
      License: Apache 2.0
      Location: /usr/local/lib/python2.7/dist-packages
      Requires: tensorflow-tensorboard, six, protobuf, mock, numpy, backports.weakref, wheel
      

      This is the GPU version:

      root@08b5fff06800:~# pip show tensorflow-gpu
      Name: tensorflow-gpu
      Version: 0.12.1
      Summary: TensorFlow helps the tensors flow
      Home-page: http://tensorflow.org/
      Author: Google Inc.
      Author-email: opensource@google.com
      License: Apache 2.0
      Location: /usr/local/lib/python2.7/dist-packages
      Requires: mock, numpy, protobuf, wheel, six
      

      Interestingly, the output shows that Keras is using TensorFlow version 1.3.0, which is the CPU version, rather than 0.12.1, the GPU version:

      import keras
      from keras.datasets import mnist
      from keras.models import Sequential
      from keras.layers import Dense, Dropout, Flatten
      from keras.layers import Conv2D, MaxPooling2D
      from keras import backend as K
      
      import tensorflow as tf
      print('Tensorflow: ', tf.__version__)
      

      Output:

      root@08b5fff06800:~/sharedfolder# python test.py
      Using TensorFlow backend.
      Tensorflow:  1.3.0
      

      I guess I now need to figure out how to make Keras use the GPU version of TensorFlow.

4 answers:

Answer 0 (score: 18)

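It is never a good idea to have both the tensorflow and tensorflow-gpu packages installed side by side (the one time it happened to me by accident, Keras was using the CPU version).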

  

"I guess I now need to figure out how to make Keras use the GPU version of TensorFlow."

You should simply remove both packages from your system and then reinstall tensorflow-gpu [updated after comment]:

pip uninstall tensorflow tensorflow-gpu
pip install tensorflow-gpu

Also, it is puzzling that you seem to be using the floydhub/dl-docker:cpu container, whereas according to the instructions you should be using the floydhub/dl-docker:gpu one...
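
To check from Python whether both packages are installed side by side, as described at the start of this answer, the installed distribution names can be inspected. This is a minimal sketch, assuming Python 3.8 or later for importlib.metadata (the Python 2.7 container in the question does not have it); the helper name find_tensorflow_conflict is hypothetical.

```python
from importlib import metadata

# Distribution names that both provide the `tensorflow` import package.
TF_DISTRIBUTIONS = ("tensorflow", "tensorflow-gpu")


def find_tensorflow_conflict(installed=None):
    """Return {name: version} for every TensorFlow distribution found.

    `installed` is an optional {name: version} mapping (useful for testing);
    when omitted, the current environment is inspected.
    """
    if installed is None:
        installed = {}
        for name in TF_DISTRIBUTIONS:
            try:
                installed[name] = metadata.version(name)
            except metadata.PackageNotFoundError:
                pass  # this distribution is not installed
    return {n: v for n, v in installed.items() if n in TF_DISTRIBUTIONS}


if __name__ == "__main__":
    found = find_tensorflow_conflict()
    if len(found) > 1:
        print("Conflict: both packages are installed:", found)
    else:
        print("Installed TensorFlow distributions:", found)
```

If more than one entry is reported, uninstalling both and reinstalling only tensorflow-gpu, as shown above, should leave a single distribution.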

Answer 1 (score: 2)

In the future, you can try using virtual environments to keep the CPU and GPU versions of TensorFlow separate, for example:

conda create --name tensorflow python=3.5
activate tensorflow
pip install tensorflow

conda create --name tensorflow-gpu python=3.5
activate tensorflow-gpu
pip install tensorflow-gpu
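
With two environments like these, it helps to confirm which one a script is actually running in before testing for the GPU. The following is a small sketch that relies on the CONDA_DEFAULT_ENV variable that conda sets on activation; the helper name active_environment is hypothetical.

```python
import os
import sys


def active_environment(environ=None, prefix=None):
    """Return (conda environment name or None, interpreter prefix)."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    return environ.get("CONDA_DEFAULT_ENV"), prefix


if __name__ == "__main__":
    name, prefix = active_environment()
    print("conda environment:", name)
    print("interpreter prefix:", prefix)
```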

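Answer 2 (score: 2)

I ran into a similar problem: Keras was not using my GPU. I had installed tensorflow-gpu into conda following the instructions, but after installing Keras the GPU was no longer listed as an available device at all. I then realized that installing keras pulls in the tensorflow package! So I had both the tensorflow and tensorflow-gpu packages installed. I found that a keras-gpu package is available. After completely uninstalling keras, tensorflow, and tensorflow-gpu, and then installing tensorflow-gpu and keras-gpu, the problem was solved.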
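
To see whether the GPU is listed as an available device, the devices reported by TensorFlow can be printed. This is a minimal sketch: device_lib.list_local_devices() is the call commonly used with TensorFlow 1.x (TensorFlow 2.x also provides tf.config.list_physical_devices('GPU')). The helper name gpu_names is hypothetical, and the import is guarded so that the script still runs where TensorFlow is not installed.

```python
def gpu_names(devices):
    """Return the names of all devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not installed in this environment.")
    else:
        names = gpu_names(device_lib.list_local_devices())
        print("GPUs visible to TensorFlow:", names or "none")
```

An empty result with a working driver would be consistent with the CPU-only tensorflow package being the one that gets imported.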

Answer 3 (score: -1)

This worked for me: install tensorflow v2.2.0 with pip install tensorflow==2.2.0, and also remove tensorflow-gpu if it is present.