Unable to open a TensorFlow session

Asked: 2017-09-24 15:15:47

Tags: tensorflow gpu nvidia tensorflow-gpu

When I try to open a TensorFlow session, I get the following error:

2017-09-24 10:49:20.526121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: GeForce GTX 970
major: 5 minor: 2 memoryClockRate (GHz) 1.342
pciBusID 0000:03:00.0
Total memory: 3.94GiB
Free memory: 3.87GiB
2017-09-24 10:49:20.599629: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x3dcf7e0 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-09-24 10:49:20.599947: E tensorflow/core/common_runtime/direct_session.cc:171] Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/python-envs/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1486, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/user/python-envs/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 621, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/user/python-envs/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

My system has two GPUs, one for display and the other for compute:

GPU0 (display) : Nvidia NVS 310 
GPU1 (compute) : Nvidia Geforce GTX 970
Graphics Driver: 384.66
CUDA version   : 8
cuDNN version  : v6 for CUDA 8 (April 27, 2017)
Operating Sys. : Ubuntu 16.04

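Has anyone else run into this? How can I debug or fix it?

Note: I did try to open an issue on GitHub, but before I could finish I was asked to look for earlier questions on SO or to ask there instead.

Thanks!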

1 answer:

Answer 0: (score: 0)

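It seems TensorFlow tries to grab every available GPU for computation, as described in the GitHub issue mentioned below. Setting the environment variable CUDA_VISIBLE_DEVICES to the device I wanted to use for compute fixed it.

A possibly related issue on GitHub: Segmentation fault when GPUs are already used

On Ubuntu, you can check the device IDs by running the nvidia-smi utility.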
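
As a rough sketch of the fix, the variable can also be set from Python, as long as that happens before TensorFlow is imported. The helper name restrict_visible_gpus and the index 1 are assumptions for illustration, not part of the original answer. Setting CUDA_DEVICE_ORDER to PCI_BUS_ID is an extra step added here so that the indices should match the ones nvidia-smi prints; by default CUDA may order devices differently, which could explain why the log above reports the GTX 970 as device 0.

```python
import os


def restrict_visible_gpus(device_ids, env=os.environ):
    """Expose only the given GPU indices to CUDA (hypothetical helper).

    Must run before TensorFlow is imported; changing the variable after
    the CUDA runtime has been initialised has no effect.
    """
    # Enumerate devices by PCI bus ID so indices line up with nvidia-smi.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env["CUDA_VISIBLE_DEVICES"]


# Assumption: index 1 is the compute card (GTX 970) in nvidia-smi's listing.
restrict_visible_gpus([1])

# Only now import TensorFlow and create the session, for example:
# import tensorflow as tf
# sess = tf.Session()
```

The same effect can be had by exporting the variable in the shell before starting Python. Once a single GPU is exposed this way, TensorFlow sees it as device 0 regardless of its original index.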