I installed TensorFlow 1.5:
pip install tensorflow==1.5
because I was getting this error:
ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
After that, I installed Keras:
pip install Keras
Finally, I installed the GPU build of TensorFlow:
pip install tensorflow-gpu
While my code was running, I checked the output of:
nvidia-smi
to see whether my process was listed, but it was not. What should I do?
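The nvidia-smi check described above can also be scripted. The following is a minimal sketch, assuming nvidia-smi is on the PATH and supports the `--query-compute-apps` CSV query; the helper names `parse_compute_apps` and `pid_uses_gpu` are hypothetical and not part of any library.

```python
import subprocess

# Command whose CSV output lists the processes currently using the GPU.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader",
]


def parse_compute_apps(csv_text):
    """Parse lines such as '12345, python, 5997 MiB' into dictionaries."""
    apps = []
    for line in csv_text.splitlines():
        if line.count(",") < 2:
            continue  # blank or unexpected line
        pid, rest = line.split(",", 1)
        name, memory = rest.rsplit(",", 1)
        if not pid.strip().isdigit():
            continue
        apps.append({
            "pid": int(pid.strip()),
            "name": name.strip(),
            "memory": memory.strip(),
        })
    return apps


def pid_uses_gpu(pid, csv_text):
    """Return True if the given process id appears in the nvidia-smi listing."""
    return any(app["pid"] == pid for app in parse_compute_apps(csv_text))


def query_compute_apps():
    """Run nvidia-smi and return its raw CSV output (requires an NVIDIA driver)."""
    return subprocess.check_output(QUERY).decode()
```

Calling `pid_uses_gpu(os.getpid(), query_compute_apps())` from inside the training script, after the TensorFlow session has been created, would report whether that process holds GPU memory.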
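Answer 0 (score: 0)
Try uninstalling the previously installed tensorflow-gpu and then reinstalling it with an explicit version. tensorflow-gpu also installs tensorflow as a dependency, so the two packages may end up at different versions. See its Git repo for more information on this issue.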
pip uninstall tensorflow
pip uninstall tensorflow-gpu
pip install tensorflow==1.5.0
pip install tensorflow-gpu==1.5.0
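To confirm that the reinstall left both packages at the same version, the output of `pip freeze` can be inspected. This is a minimal sketch that works on the text of that output; the function names `tensorflow_versions` and `versions_match` are hypothetical helpers written for this illustration.

```python
def tensorflow_versions(freeze_text):
    """Return {package: version} for TensorFlow packages in `pip freeze` output."""
    found = {}
    for line in freeze_text.splitlines():
        name, sep, version = line.strip().partition("==")
        # pip may print the name with "-" or "_", so normalize before comparing.
        name = name.lower().replace("_", "-")
        if sep and name in ("tensorflow", "tensorflow-gpu"):
            found[name] = version
    return found


def versions_match(found):
    """Return True when all installed TensorFlow packages share one version."""
    return len(set(found.values())) <= 1
```

For example, passing the text produced by `pip freeze` to `tensorflow_versions` and then to `versions_match` returns False when tensorflow is at 1.5.0 but tensorflow-gpu is at a newer release.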
Answer 1 (score: 0)
Have you installed CUDA and cuDNN, and added the environment paths to your .bashrc?
Set the CUDA paths in ~/.bashrc:
# CUDA / NVIDIA paths
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
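Once the paths are set, it can help to check whether the library named in the ImportError is actually reachable through them. The sketch below is an illustration only: `find_shared_library` is a hypothetical helper, and it looks only at the directories in an LD_LIBRARY_PATH-style string. The dynamic loader also consults the ldconfig cache and default system directories, so an empty result here does not prove the library is missing.

```python
import os


def find_shared_library(lib_name, ld_library_path):
    """Return the directories in an LD_LIBRARY_PATH-style string that contain lib_name."""
    hits = []
    for directory in ld_library_path.split(":"):
        # Empty entries come from leading, trailing, or doubled colons.
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits
```

In a new shell, `find_shared_library("libcusolver.so.8.0", os.environ.get("LD_LIBRARY_PATH", ""))` should list /usr/local/cuda/lib64 if CUDA 8.0 is installed there.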
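P.S.: You only need to install the tensorflow-gpu package, not the CPU version.
After that, you need to download a cuDNN version that is compatible with your CUDA version. For example, you can download cuDNN v7.1.3 over ssh/PuTTY: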
$ wget https://developer.download.nvidia.com/compute/machine-learning/cudnn/secure/v7.1.3/prod/8.0_20180414/cudnn-8.0-linux-x64-v7.1.tgz?2H2brULbeJyaEnHtkjrwUED5JR-vx6QBedn-A-nsa9Q73hUCfYCmWbFG_-eBnlAv3_CjRmI3qhRPJDiRz0wRC3oGviiYTX7M-H-RUiZQR_vuo21iCM5W-R0iaJOuwt0bmw-RFTg2XK_a8gdiV-uemVTH8Lf8-Q8rf3Msh52hznszsZKCP0hq2DvYNFuTSjyOSgPiH-3c_Th2uw
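Once it has downloaded, just extract the archive and copy the relevant files into the CUDA installation directory. This example is for CUDA v8.0 and cuDNN v7.1.3: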
$ mv cudnn-8.0-linux-x64-v7.1.tgz?2H2brULbeJyaEnHtkjrwUED5JR-vx6QBedn-A-nsa9Q73hUCfYCmWbFG_-eBnlAv3_CjRmI3qhRPJDiRz0wRC3oGviiYTX7M-H-RUiZQR_vuo21iCM5W-R0iaJOuwt0bmw-RFTg2XK_a8gdiV-uemVTH8Lf8-Q8rf3Msh52hznszsZKCP0hq2DvYNFuTSjyOSgPiH-3c_Th2uw cudnn-8.0-linux-x64-v7.1.tgz
$ tar -zxvf cudnn-8.0-linux-x64-v7.1.tgz
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h
$ sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
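After copying the files, the installed cuDNN version can be read back from the header. This is a minimal sketch, assuming the cuDNN 7 layout used in this example, where cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (later cuDNN releases keep these macros in a separate cudnn_version.h). The function name `cudnn_version` is hypothetical.

```python
import re


def cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None if absent."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#\s*define\s+" + macro + r"\s+(\d+)",
            header_text,
            re.MULTILINE,
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)
```

Reading /usr/local/cuda/include/cudnn.h and passing its contents to `cudnn_version` should give (7, 1, 3) for the archive used above.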
Then you can check whether TensorFlow is running on the GPU:
# Python
import tensorflow as tf
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
If everything is working, you will get output like this:
2018-09-25 13:27:23.614538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1070 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
totalMemory: 7.93GiB freeMemory: 6.22GiB
2018-09-25 13:27:23.614552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-09-25 13:27:23.800175: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-25 13:27:23.800210: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-09-25 13:27:23.800215: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-09-25 13:27:23.800393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5997 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1070 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2018-09-25 13:27:23.857141: I tensorflow/core/common_runtime/direct_session.cc:284] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1070 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2018-09-25 13:27:23.857701: I tensorflow/core/common_runtime/placer.cc:886] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2018-09-25 13:27:23.857729: I tensorflow/core/common_runtime/placer.cc:886] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2018-09-25 13:27:23.857736: I tensorflow/core/common_runtime/placer.cc:886] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
[[22. 28.]
 [49. 64.]]
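Reading this log by eye gets tedious for larger graphs. As a rough sketch, the placement lines can be scanned for any op that did not land on a GPU. The regular expression below is an assumption based only on the line format shown in this output, and `op_devices` and `ops_not_on_gpu` are hypothetical helper names.

```python
import re

# Matches lines such as
# "MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0"
PLACEMENT = re.compile(r"(\S+): \((\w+)\):?\s*\S*/device:([A-Z]+):(\d+)\s*$")


def op_devices(log_text):
    """Map each op name found in the placement log to its device, e.g. 'GPU:0'."""
    devices = {}
    for line in log_text.splitlines():
        match = PLACEMENT.search(line)
        if match:
            devices[match.group(1)] = match.group(3) + ":" + match.group(4)
    return devices


def ops_not_on_gpu(devices):
    """Return the sorted names of ops whose device is not a GPU."""
    return sorted(name for name, dev in devices.items() if not dev.startswith("GPU:"))
```

For the output above, `op_devices` would map MatMul, a and b to GPU:0, and `ops_not_on_gpu` would return an empty list.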