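I am using Ubuntu 20.04. I upgraded TensorFlow from 2.2.0 to 2.3.0. With version 2.2.0, TensorFlow used the GPU without problems, but after upgrading to 2.3.0 it no longer detects the GPU.
I have already seen this link on Stack Overflow, where the cause was the cuDNN version. However, I do have the required cuDNN version.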
me_sajied@Kunai:~$ apt list | grep cudnn
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
libcudnn7-dev/now 7.6.5.32-1+cuda10.1 amd64 [installed,local]
libcudnn7/now 7.6.5.32-1+cuda10.1 amd64 [installed,local]
I also have all the other required software in the required versions.
me_sajied@Kunai:~$ apt list | grep cuda-toolkit
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
cuda-toolkit-10-0/unknown 10.0.130-1 amd64
cuda-toolkit-10-1/unknown,now 10.1.243-1 amd64 [installed,automatic]
cuda-toolkit-10-2/unknown 10.2.89-1 amd64
cuda-toolkit-11-0/unknown,unknown 11.0.3-1 amd64
nvidia-cuda-toolkit-gcc/focal 10.1.243-3 amd64
nvidia-cuda-toolkit/focal 10.1.243-3 amd64
me_sajied@Kunai:~$ python3 --version
Python 3.8.2
LD_LIBRARY_PATH="/usr/local/cuda-10.1/lib64"
me_sajied@Kunai:~$ python3
Python 3.8.2 (default, Jul 16 2020, 14:00:26)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-09-13 21:28:37.387327: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
>>>
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-09-13 21:28:48.806385: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-13 21:28:48.836251: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2699905000 Hz
2020-09-13 21:28:48.836637: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3fde5f0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-13 21:28:48.836685: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-13 21:28:48.840030: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-09-13 21:28:48.882190: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-13 21:28:48.882582: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x408bd90 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-13 21:28:48.882606: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce 930MX, Compute Capability 5.0
2020-09-13 21:28:48.882796: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-13 21:28:48.883151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 930MX computeCapability: 5.0
coreClock: 1.0195GHz coreCount: 3 deviceMemorySize: 1.96GiB deviceMemoryBandwidth: 14.92GiB/s
2020-09-13 21:28:48.883196: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-09-13 21:28:48.883415: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/extras/CUPTI/lib64
2020-09-13 21:28:48.885196: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-13 21:28:48.885544: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-13 21:28:48.887160: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-13 21:28:48.888134: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-13 21:28:48.891565: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-09-13 21:28:48.891603: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-09-13 21:28:48.891625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-13 21:28:48.891632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-09-13 21:28:48.891639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
False
>>>
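Answer 0 (score: 0)
Add the following to your ~/.bashrc:
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64
If your lib64 folder is in a different location, adjust the path accordingly. The export matters: without it the variable may not be passed on to python3, and the log in the question shows that TensorFlow saw LD_LIBRARY_PATH as /usr/local/cuda/extras/CUPTI/lib64 rather than the cuda-10.1 path.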
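To check whether the change took effect, you can ask the dynamic loader directly instead of waiting for TensorFlow to complain. The following is only a minimal sketch, not something from the original answer: it tries to open each library named in the question's log with ctypes and reports the ones that fail. The helper name find_missing_libraries is made up for illustration, and the list assumes the library names from the TensorFlow 2.3.0 log above. Run it from a new shell so that the updated ~/.bashrc has been read.

```python
import ctypes

# Library names copied from the TensorFlow 2.3.0 log in the question.
REQUIRED_LIBRARIES = [
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
    "libcudnn.so.7",
]


def find_missing_libraries(names):
    """Return the names that the dynamic loader cannot open in this process.

    Hypothetical helper for illustration; it relies on the same search
    path (including LD_LIBRARY_PATH) that the process was started with.
    """
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            # Raised when the shared object cannot be found or loaded.
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in find_missing_libraries(REQUIRED_LIBRARIES):
        print("cannot load:", name)
```

If libcublas.so.10 still shows up in the output, the library is probably not in the directory that LD_LIBRARY_PATH points to, which is the situation the second answer describes.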
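As a side note, if you frequently switch between several CUDA versions, you can also set the environment variable for a single command directly in the terminal, for example: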
LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64 python myprogram_which_needs_10_1.py
Then, to switch to a different version, just change the path in front of the command.
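If you launch your programs from Python rather than from a shell, the same per-command idea can be sketched with subprocess. This is an illustration under assumptions, not part of the original answer: run_with_library_path is a hypothetical helper, and like the shell one-liner it replaces LD_LIBRARY_PATH for the child process only, leaving the parent environment untouched.

```python
import os
import subprocess
import sys


def run_with_library_path(command, library_path):
    """Run command with LD_LIBRARY_PATH set only for that child process.

    Hypothetical helper; it copies the current environment, overrides
    one variable, and returns the CompletedProcess of the child.
    """
    env = dict(os.environ)
    env["LD_LIBRARY_PATH"] = library_path
    return subprocess.run(command, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # The child just echoes the value it received, to show the override.
    result = run_with_library_path(
        [sys.executable, "-c", "import os; print(os.environ['LD_LIBRARY_PATH'])"],
        "/usr/local/cuda-10.1/lib64",
    )
    print(result.stdout.strip())
```

In real use the command would be something like [sys.executable, "myprogram_which_needs_10_1.py"], with a different path for each CUDA version you need.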
Answer 1 (score: -2)
2020-09-13 21:28:48.883415: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory;
In my case, this was caused by apt upgrade installing the CUDA 10.2 versions of libcublas10 and libcublas-dev.
My solution to this problem is as follows.
$ sudo apt install --reinstall libcublas10=10.2.1.243-1 libcublas-dev=10.2.1.243-1
Then put the packages on hold so that apt does not offer them as upgrade candidates again.
$ sudo apt-mark hold libcublas10
$ sudo apt-mark hold libcublas-dev
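To confirm which version actually ended up installed, you can read the output of apt list the same way the question does. The sketch below is not from the original answer: parse_apt_list_line is a hypothetical helper that splits one line of that output into name, version, architecture and status flags, assuming the line layout shown in the question (for example the libcudnn7 line). As apt itself warns, this output format is not a stable interface, so treat the parser as a convenience only.

```python
def parse_apt_list_line(line):
    """Split one `apt list` output line into its parts.

    Hypothetical helper. Returns None for lines that are not package
    entries, such as the "WARNING: apt does not have a stable CLI
    interface" notice.
    """
    fields = line.split()
    if len(fields) < 3 or "/" not in fields[0]:
        return None
    name = fields[0].split("/", 1)[0]
    status = []
    if len(fields) > 3:
        # Status flags look like "[installed,local]".
        status = " ".join(fields[3:]).strip("[]").split(",")
    return {
        "name": name,
        "version": fields[1],
        "arch": fields[2],
        "status": status,
    }


if __name__ == "__main__":
    # Sample line copied from the question.
    sample = "libcudnn7/now 7.6.5.32-1+cuda10.1 amd64 [installed,local]"
    print(parse_apt_list_line(sample))
```

Feeding it the libcublas10 line from apt list lets you compare the reported version with the one pinned in the reinstall command above.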