Question

I have an Ubuntu 16.04 machine with two NVIDIA GPUs installed:

GPU 0: GeForce GT 610 (UUID: GPU-710e856e-358f-7b7d-95b7-e4eae7037c1f)
GPU 1: GeForce GTX TITAN X (UUID: GPU-5eacd6f3-f9e4-5795-c75c-26e34ced55ce)

nvidia-smi output:

Sun Jun 10 17:21:47 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 610      Off  | 00000000:02:00.0 N/A |                  N/A |
| 40%   49C    P8    N/A /  N/A |    133MiB /  1985MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  Off  | 00000000:03:00.0 Off |                  N/A |
| 22%   50C    P8    15W / 250W |      2MiB / 12207MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

I followed the steps at https://www.tensorflow.org/install/install_linux#InstallingAnaconda to install the GPU build of TensorFlow in an Anaconda environment. However, when I start a TF session, I get the following error:

Python 2.7.15 |Anaconda, Inc.| (default, May  1 2018, 23:32:55) 
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> x = tf.Variable( "Hello..!" )
>>> sess = tf.Session()
2018-06-10 17:16:07.662527: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-10 17:16:07.843402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076
pciBusID: 0000:03:00.0
totalMemory: 11.92GiB freeMemory: 11.80GiB
2018-06-10 17:16:07.880682: E tensorflow/core/common_runtime/direct_session.cc:154] Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/miniconda2/envs/tf-gpu/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1560, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/opt/miniconda2/envs/tf-gpu/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 633, in __init__
    self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

What am I missing? How can I get rid of this error?
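
One diagnostic that may be worth trying, offered as an untested sketch rather than a confirmed fix: the log finds the TITAN X as device 0 but then fails on CUDA device ordinal 1, so the failure may come from CUDA trying to initialize the GT 610. Assuming that is the cause, the snippet below hides the GT 610 from CUDA before TensorFlow is imported. `CUDA_DEVICE_ORDER` and `CUDA_VISIBLE_DEVICES` are standard CUDA environment variables. The helper name `restrict_cuda_to` is hypothetical, and the index 1 is taken from the nvidia-smi listing above.

```python
import os


def restrict_cuda_to(device_index, env=None):
    """Expose a single GPU to CUDA, using nvidia-smi (PCI bus) numbering.

    Hypothetical helper for illustration; it only sets environment
    variables and does not touch TensorFlow itself.
    """
    env = os.environ if env is None else env
    # Make CUDA enumerate devices in PCI bus order, matching nvidia-smi.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Only the listed device is visible to CUDA; it becomes ordinal 0.
    env["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return env


# Assumption: GPU 1 in the nvidia-smi output above is the GTX TITAN X.
restrict_cuda_to(1)

# TensorFlow must be imported only after the variables are set:
# import tensorflow as tf
# sess = tf.Session()
```

If the session then starts, the problem is likely tied to the second card rather than to the TensorFlow installation itself. The same variables can also be set in the shell before launching Python.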

TensorFlow 1.8 on Titan X: CUDA_ERROR_INVALID_DEVICE

0 answers: