TensorFlow not using GPU: CUDA is installed, but the GPU does not appear to be enabled for CUDA

Time: 2019-07-25 00:55:27

Tags: python tensorflow

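I have an application that uses an object detection model built with Keras and TensorFlow. I have installed tensorflow-gpu. When the application runs, I do not see my GPU being used as expected. So, following this post/answer, I tried to verify that TensorFlow can use my GPU, but it fails with an error indicating that CUDA is not enabled for my GPU (i.e. "The requested device appears to be a GPU, but CUDA is not enabled."):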

$ python 
Python 3.7.4 (default, Jul  9 2019, 15:11:16) 
[GCC 7.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> with tf.device('/gpu:0'):
...     a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
...     b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
...     c = tf.matmul(a, b)
... 
>>> with tf.Session() as sess:
...     print(sess.run(c))
... 
2019-07-24 20:31:37.175391: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-24 20:31:37.204704: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2208000000 Hz
2019-07-24 20:31:37.207126: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x560dd3d22490 executing computations on platform Host. Devices:
2019-07-24 20:31:37.207196: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
Traceback (most recent call last):
  File "/home/james/.virtualenvs/deep_monitor_venv/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/home/james/.virtualenvs/deep_monitor_venv/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1339, in _run_fn
    self._extend_graph()
  File "/home/james/.virtualenvs/deep_monitor_venv/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1374, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation MatMul: {{node MatMul}}was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled.
     [[MatMul]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/james/.virtualenvs/deep_monitor_venv/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/home/james/.virtualenvs/deep_monitor_venv/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/james/.virtualenvs/deep_monitor_venv/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/home/james/.virtualenvs/deep_monitor_venv/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation MatMul: node MatMul (defined at <stdin>:4) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled.
     [[MatMul]]

Errors may have originated from an input operation.
Input Source operations connected to node MatMul:
 a (defined at <stdin>:2)   
 b (defined at <stdin>:3)
>>> 
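
A shorter check than the matmul session above is to ask the imported TensorFlow build directly whether it was compiled with CUDA and which devices it registers. The following is only a sketch: the helper name `gpu_report` is hypothetical, and it assumes `tf.test.is_built_with_cuda()` and `device_lib.list_local_devices()` behave in this environment as they do in TensorFlow 1.14.

```python
def gpu_report():
    """Return basic GPU facts about the importable TensorFlow, or None if it is missing."""
    try:
        import tensorflow as tf
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return {
        "version": tf.__version__,
        # Location of the package that `import tensorflow` actually resolved to.
        "module_path": tf.__file__,
        # False means this binary is a CPU-only build, whatever pip reports.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Names of the devices this build registers (CPU, XLA_CPU, GPU, ...).
        "devices": [d.name for d in device_lib.list_local_devices()],
    }


if __name__ == "__main__":
    report = gpu_report()
    if report is None:
        print("TensorFlow is not importable in this environment")
    else:
        for key, value in report.items():
            print(key, "=", value)
```

If `built_with_cuda` comes back `False`, the interpreter is loading a CPU-only binary, which would be consistent with the "CUDA is not enabled" wording in the error; in that case the driver and toolkit are probably not where the problem lies.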

However, as far as I can tell, the CUDA toolkit is installed correctly and CUDA is enabled on the GPU:

$ ll /usr/local/cuda
lrwxrwxrwx 1 root root 9 Jun 12 15:59 /usr/local/cuda -> cuda-10.1/

$ nvidia-smi
Wed Jul 24 16:14:15 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   43C    P0    N/A /  N/A |    830MiB /  4042MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+
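
The output above shows that the driver and a CUDA 10.1 toolkit are present, but not whether the specific runtime libraries a given TensorFlow binary expects can be loaded. As far as I know, the prebuilt tensorflow-gpu 1.14 wheels are documented as built against CUDA 10.0 and cuDNN 7, so the library names below are assumptions based on that, and `loadable` is a hypothetical helper. A minimal sketch of such a check:

```python
import ctypes


def loadable(names):
    """Map each shared-library name to True if the dynamic loader can open it."""
    result = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            result[name] = True
        except OSError:
            # Raised when the library is not on the loader's search path.
            result[name] = False
    return result


if __name__ == "__main__":
    # Assumed sonames; adjust to the versions your TensorFlow build expects.
    candidates = ["libcudart.so.10.0", "libcudart.so.10.1", "libcudnn.so.7"]
    for name, ok in loadable(candidates).items():
        print(name, "loadable" if ok else "NOT found")
```

A missing `libcudart.so.10.0` alongside a loadable `libcudart.so.10.1` would point to a toolkit/wheel version mismatch rather than a driver problem.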

Installed packages in my virtual environment:

$ pip list
Package              Version 
-------------------- --------
absl-py              0.7.1   
arrow                0.14.2  
astor                0.8.0   
Cython               0.29.12 
gast                 0.2.2   
google-pasta         0.1.7   
grpcio               1.22.0  
h5py                 2.9.0   
imutils              0.5.2   
joblib               0.13.2  
Keras                2.2.4   
Keras-Applications   1.0.8   
Keras-Preprocessing  1.1.0   
keras-resnet         0.2.0   
keras-retinanet      0.5.1   
Markdown             3.1.1   
numpy                1.16.4  
opencv-python        4.1.0.25
Pillow               6.1.0   
pip                  19.2.1  
progressbar2         3.42.0  
protobuf             3.9.0   
python-dateutil      2.8.0   
python-utils         2.3.0   
PyYAML               5.1.1   
scikit-learn         0.21.2  
scipy                1.3.0   
setuptools           41.0.1  
six                  1.12.0  
SQLAlchemy           1.3.6   
SQLAlchemy-Utils     0.34.1  
tensorboard          1.14.0  
tensorflow           1.14.0  
tensorflow-estimator 1.14.0  
tensorflow-gpu       1.14.0  
termcolor            1.1.0   
Werkzeug             0.15.5  
wget                 3.2     
wheel                0.33.4  
wrapt                1.11.2
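
One thing worth checking in a listing like this is whether both `tensorflow` and `tensorflow-gpu` are installed, as they are here. In TensorFlow 1.x both distributions unpack into the same `tensorflow` package directory, so the CPU-only files may end up being the ones that get imported. The sketch below parses `pip list` output and reports which of the two are present; the helper names and the `SAMPLE` excerpt (copied from the listing above) are for illustration only.

```python
def parse_pip_list(text):
    """Parse the two-column table printed by `pip list` into {name: version}."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header and the dashed rule.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[parts[0].lower()] = parts[1]
    return packages


def tensorflow_builds(packages):
    """Return the installed distributions that each ship a `tensorflow` module."""
    return sorted(name for name in packages if name in ("tensorflow", "tensorflow-gpu"))


SAMPLE = """\
Package              Version
-------------------- --------
tensorboard          1.14.0
tensorflow           1.14.0
tensorflow-estimator 1.14.0
tensorflow-gpu       1.14.0
"""

if __name__ == "__main__":
    found = tensorflow_builds(parse_pip_list(SAMPLE))
    print(found)
    if len(found) > 1:
        print("Both builds are installed; they share one package directory.")
```

If both names are reported, uninstalling both and reinstalling only `tensorflow-gpu` is a common way to rule this out, though it is not confirmed to be the cause in this case.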

My system is Ubuntu 18.04.2 LTS.

0 answers:

No answers