Installing TensorFlow-GPU

Time: 2017-10-07 13:11:56

Tags: tensorflow nvidia tensorflow-gpu

I am trying to install tensorflow-gpu. The problem is that I have the nvidia-375.82 driver installed, while tensorflow requires 375.66.

When I got this error

ImportError: libnvidia-fatbinaryloader.so.375.66: cannot open shared object file: No such file or directory

I tried creating a symbolic link

sudo ln -s /usr/lib/nvidia-375/libnvidia-fatbinaryloader.so.375.82 /usr/lib/nvidia-375/libnvidia-fatbinaryloader.so.375.66

That gets rid of the ImportError, but nothing more. If I try to run something like

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

the result is computed on the CPU, and the following is printed

2017-10-07 15:56:03.329769: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.329832: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.329850: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.329864: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.329878: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.429055: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2017-10-07 15:56:03.429198: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: sklert-new-comp
2017-10-07 15:56:03.429226: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: sklert-new-comp
2017-10-07 15:56:03.429317: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 375.66.0
2017-10-07 15:56:03.429384: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.82  Wed Jul 19 21:16:49 PDT 2017
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) 
"""
2017-10-07 15:56:03.429446: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 375.82.0
2017-10-07 15:56:03.429473: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:303] kernel version 375.82.0 does not match DSO version 375.66.0 -- cannot find working devices in this configuration
Device mapping: no known devices.
2017-10-07 15:56:03.430336: I tensorflow/core/common_runtime/direct_session.cc:300] Device mapping:

MatMul: (MatMul): /job:localhost/replica:0/task:0/cpu:0
2017-10-07 15:56:03.467133: I tensorflow/core/common_runtime/simple_placer.cc:872] MatMul: (MatMul)/job:localhost/replica:0/task:0/cpu:0
b: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-10-07 15:56:03.467201: I tensorflow/core/common_runtime/simple_placer.cc:872] b: (Const)/job:localhost/replica:0/task:0/cpu:0
a: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-10-07 15:56:03.467226: I tensorflow/core/common_runtime/simple_placer.cc:872] a: (Const)/job:localhost/replica:0/task:0/cpu:0
[[ 22.  28.]
 [ 49.  64.]]

Is there a way to use tensorflow with the GPU without downgrading the driver?

... It seems the problem is not in TensorFlow but in the NVIDIA drivers

sudo dmesg | grep NVRM

[    1.267417] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  375.82  Wed Jul 19 21:16:49 PDT 2017 (using threaded interrupts)
[  108.803115] NVRM: API mismatch: the client has the version 375.66, but
               NVRM: this kernel module has the version 375.82.  Please
               NVRM: make sure that this kernel module and all NVIDIA driver
               NVRM: components have the same version.
[ 1419.021917] NVRM: API mismatch: the client has the version 375.66, but
               NVRM: this kernel module has the version 375.82.  Please
               NVRM: make sure that this kernel module and all NVIDIA driver
               NVRM: components have the same version.
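
The API mismatch message above can also be extracted from the kernel log with a script, which is handy when checking several machines. This is only a minimal sketch, assuming the message keeps the wording shown in the dmesg output; the name `parse_api_mismatch` and the sample constant are hypothetical and used purely for illustration.

```python
import re

# Matches the NVRM "API mismatch" message as it appears in the dmesg output above.
# Assumption: other driver versions use the same wording.
_MISMATCH_RE = re.compile(
    r"client has the version (\d+(?:\.\d+)*)"
    r".*?kernel module has the version (\d+(?:\.\d+)*)",
    re.DOTALL,
)


def parse_api_mismatch(dmesg_text):
    """Return the unique (client_version, kernel_module_version) pairs, in order."""
    pairs = []
    for pair in _MISMATCH_RE.findall(dmesg_text):
        if pair not in pairs:
            pairs.append(pair)
    return pairs


# Sample text copied from the dmesg output in this question.
SAMPLE_DMESG = (
    "[  108.803115] NVRM: API mismatch: the client has the version 375.66, but\n"
    "               NVRM: this kernel module has the version 375.82.  Please\n"
    "               NVRM: make sure that this kernel module and all NVIDIA driver\n"
    "               NVRM: components have the same version.\n"
)

if __name__ == "__main__":
    for client, kernel in parse_api_mismatch(SAMPLE_DMESG):
        print("user-space client:", client, "| kernel module:", kernel)
```

On a real machine the text would come from `dmesg` itself (for example via `subprocess.run`) instead of the sample string.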

Some driver components have a different version:

locate 375.66
/usr/lib/i386-linux-gnu/libcuda.so.375.66
/usr/lib/i386-linux-gnu/libnvidia-opencl.so.375.66
/usr/lib/nvidia-375/libnvidia-fatbinaryloader.so.375.66
/usr/lib/x86_64-linux-gnu/libcuda.so.375.66
/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.375.66
/usr/lib32/nvidia-375/libnvidia-fatbinaryloader.so.375.66
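
The same comparison can be scripted instead of done by eye. This is a rough sketch, assuming the kernel module version is taken from text in the format of /proc/driver/nvidia/version (the "driver version file contents" that TensorFlow prints in the log above) and that user-space libraries carry the driver version as a filename suffix, as in the locate output. The function names are hypothetical.

```python
import re


def kernel_module_version(version_file_text):
    """Extract the driver version from text such as /proc/driver/nvidia/version."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", version_file_text)
    return match.group(1) if match else None


def stale_libraries(paths, kernel_version):
    """Return the paths whose '.so.<major>.<minor>' suffix differs from kernel_version.

    Files with a plain soname such as 'libcuda.so.1' are ignored, because they do
    not encode a driver version.
    """
    stale = []
    for path in paths:
        match = re.search(r"\.so\.(\d+\.\d+(?:\.\d+)*)$", path)
        if match and match.group(1) != kernel_version:
            stale.append(path)
    return stale


if __name__ == "__main__":
    # Values copied from the log and the locate output in this question.
    version_text = (
        "NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.82  "
        "Wed Jul 19 21:16:49 PDT 2017"
    )
    libs = [
        "/usr/lib/x86_64-linux-gnu/libcuda.so.375.66",
        "/usr/lib/nvidia-375/libnvidia-fatbinaryloader.so.375.82",
    ]
    kernel = kernel_module_version(version_text)
    for lib in stale_libraries(libs, kernel):
        print("version differs from kernel module", kernel, "->", lib)
```

Any file this reports belongs to a different driver release than the loaded kernel module, which is the condition the NVRM message complains about.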

0 answers:

No answers