I am trying to train a model on Ubuntu 18, and I followed the TensorFlow GPU documentation: https://www.tensorflow.org/install/gpu
Ubuntu 18
CUDA 11
tensorflow-gpu 1.13
I ran into this problem:
2021-02-03 13:16:00.755944: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.756245: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.756534: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.756834: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.757106: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.757389: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.757674: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.757800: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2021-02-03 13:16:00.757899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-02-03 13:16:00.757992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2021-02-03 13:16:00.758088: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2021-02-03 13:16:01.201726: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
From the errors I can see that the CUDA library files were not found, and after checking, no such files exist on the system.
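One way to repeat that check from Python is to ask the dynamic loader directly for each library named in the log. This is a minimal sketch, not part of the original post: the function name `find_missing_libs` is hypothetical, and the list of names is copied from the "Could not dlopen library" lines above.

```python
import ctypes

# Library names copied from the "Could not dlopen library" lines in the log.
REQUIRED_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
    "libcudnn.so.7",
]


def find_missing_libs(names):
    """Return the library names that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            # ctypes.CDLL uses the same loader search path as the process,
            # so it fails in the same way TensorFlow's dlopen call does.
            ctypes.CDLL(name)
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in find_missing_libs(REQUIRED_LIBS):
        print("missing:", name)
```

Every name ends in `.so.10.0` (or `.so.7` for cuDNN), so the installed TensorFlow build is looking for CUDA 10.0 libraries, which a CUDA 11 installation would not be expected to provide under those names.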
Answer 0: (score: 0)
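The problem was the TensorFlow version. As the log shows, 1.13 looks for the CUDA 10.0 libraries (libcudart.so.10.0 and so on) and cuDNN 7, which do not match the installed CUDA 11. I updated TensorFlow to version 2.4 and it worked.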
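After upgrading, a quick way to confirm that the GPU is actually registered is to print the TensorFlow version and the visible GPU devices. The sketch below assumes a TensorFlow 2.x install (`tf.config.list_physical_devices` is not available in 1.13); the helper name `gpu_report` is hypothetical, and the import is guarded so the script still runs where TensorFlow is absent.

```python
def gpu_report():
    """Return the TensorFlow version and visible GPUs, or None if TF is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Assumes TensorFlow 2.x; an empty list means no GPU was registered.
    gpus = tf.config.list_physical_devices("GPU")
    return {"version": tf.__version__, "gpus": [d.name for d in gpus]}


if __name__ == "__main__":
    report = gpu_report()
    if report is None:
        print("TensorFlow is not installed in this environment")
    else:
        print("TensorFlow", report["version"])
        print("GPUs:", report["gpus"] or "none found")
```

If the GPU list is still empty after the upgrade, the startup log should again name whichever library failed to load.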