TensorFlow: unusual CUDA-related error

Time: 2017-06-05 18:15:20

Tags: python-2.7 tensorflow cuda gpu

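I have been using TensorFlow for almost two years and have never seen this before. On a new Ubuntu box, I did a fresh install of TensorFlow in a virtualenv. When I run the sample code, I get an invalid device error. It happens on the call to tf.Session().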

WARNING:tensorflow:From full_code.py:27: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
2017-06-05 11:01:55.853842: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.853867: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.853876: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.853886: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.853893: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.937978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 660 Ti
major: 3 minor: 0 memoryClockRate (GHz) 1.0455
pciBusID 0000:04:00.0
Total memory: 2.95GiB
Free memory: 2.91GiB
2017-06-05 11:01:55.938063: W tensorflow/stream_executor/cuda/cuda_driver.cc:485] creating context when one is currently active; existing: 0x19e5370
2017-06-05 11:01:56.014220: E tensorflow/core/common_runtime/direct_session.cc:137] Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE

Here are the full specs.

Ubuntu 14.04
CUDA 8.0
GeForce GTX 660 Ti 
python 3.4.3

1 answer:

Answer 0 (score: 1)

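Thanks to some folks at Google, I found out what was wrong. This Dell box has two Nvidia graphics cards. The first one came from the manufacturer and is an NVS 310. As far as I know, it has no usable compute capability, and I never intended to use it.

I then added a second card, a GTX 660 Ti, which I intended to use for all computation.

When TensorFlow is invoked, it defaults to device 0, which is the NVS 310. Naturally, it throws the invalid device error.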

When I run it like this,

CUDA_VISIBLE_DEVICES=1 python myscript.py

it works.
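If you would rather not type the variable on every run, the same restriction can probably be applied from inside the script. The following is only a minimal sketch: the helper name `restrict_visible_gpus` is made up for illustration, and it assumes the variable is set before TensorFlow is imported, because CUDA reads `CUDA_VISIBLE_DEVICES` when it initializes.

```python
import os


def restrict_visible_gpus(device_ids):
    """Expose only the given CUDA device indices to this process.

    Hypothetical helper, not part of TensorFlow. It has to run before any
    library initializes CUDA, so call it before importing tensorflow.
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Same effect as launching with CUDA_VISIBLE_DEVICES=1 on the command line.
restrict_visible_gpus([1])

# Import TensorFlow only after the variable is set (left commented out so
# the sketch runs on a machine without TensorFlow installed):
# import tensorflow as tf
# sess = tf.Session()
```

With only one device visible, the process sees the selected card as device 0, so any explicit device strings in the script should refer to /gpu:0.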