I get the following error when trying to compile TensorFlow from source. Any ideas would be helpful.
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cublasGemmEx@libcublas.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cublasZhpmv_v2@libcublas.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cufftExecD2Z@libcufft.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cublasSrotg_v2@libcublas.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cufftExecR2C@libcufft.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cublasSsyrk_v2@libcublas.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cublasDgemm_v2@libcublas.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cufftSetWorkArea@libcufft.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cublasChemm_v2@libcublas.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cublasZher2k_v2@libcublas.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cufftExecC2C@libcufft.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `curandSetStream@libcurand.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cublasDrotm_v2@libcublas.so.9.0'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Unn_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `curandSetPseudoRandomGeneratorSeed@libcurand.so.9.0'
Answer 0 (score: 22)
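There seems to be a bug in our build. I was able to reproduce the same failure on my machine. It looks like the value of LD_LIBRARY_PATH is not always propagated correctly during the bazel build. In my case, the build succeeded when I used this command: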
bazel build --config=opt --config=cuda tensorflow/tools/pip_package:build_pip_package --action_env="LD_LIBRARY_PATH=${LD_LIBRARY_PATH}"
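Passing LD_LIBRARY_PATH through --action_env only helps if the shell's value already contains a directory holding the library named in the error. A minimal sketch for checking that first, assuming a POSIX-like shell such as bash; the helper name find_lib_dir is made up for illustration and is not part of bazel or TensorFlow:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bazel or TensorFlow): print the first
# directory in a colon-separated search path that contains the given file.
#   $1 = library file name, $2 = colon-separated search path
find_lib_dir() {
  local lib="$1" search_path="$2" dir
  local IFS=':'
  # Unquoted on purpose so the path is split on ':' into directories.
  for dir in $search_path; do
    if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Example usage (library name taken from the error output above):
#   find_lib_dir libcublas.so.9.0 "$LD_LIBRARY_PATH" \
#     || echo "libcublas.so.9.0 is not on LD_LIBRARY_PATH" >&2
```

If the helper prints nothing, the variable needs fixing before the bazel command above is likely to make a difference.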
Answer 1 (score: 2)
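I hit the same error yesterday while trying to build TensorFlow from source against an apparently working CUDA 9.0 installation. In my case, no combination of git clean and action_env helped: ld, invoked through bazel, kept refusing to find the CUDA libraries.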
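I ended up following the instructions in this thread: as root, create a file /etc/ld.so.conf.d/cuda.conf containing the single line
/usr/local/cuda/lib64
(assuming your /usr/local/cuda/ is symlinked to your specific CUDA directory, e.g. /usr/local/cuda-9.0/).
Then run sudo ldconfig.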
With that, the build completed, and TensorFlow is using my GPU.
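The steps in this answer can be sketched as a short script. It assumes a Linux system with ldconfig and that /usr/local/cuda/lib64 is the right directory for your installation. The root-only commands are left as comments; the helper cache_has_lib is a hypothetical name, added only to check the result:

```shell
#!/usr/bin/env bash
# Root-only steps from the answer, shown as comments so nothing is
# changed by accident (the directory is an assumption about your install):
#   echo /usr/local/cuda/lib64 | sudo tee /etc/ld.so.conf.d/cuda.conf
#   sudo ldconfig

# Hypothetical helper: succeed if the named library appears in
# `ldconfig -p` style output read from stdin.
cache_has_lib() {
  grep -q -F -- "$1"
}

# Example usage after running the steps above:
#   ldconfig -p | cache_has_lib libcublas.so.9.0 && echo "found in linker cache"
```

Checking the linker cache before starting a long bazel build is a cheap way to see whether the new configuration file was picked up.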
Answer 2 (score: 0)
To make this question easier to find by search: the error output I got also included this, at the top:
libcublas.so.9.0, needed by bazel-out/[...]/libtensorflow_framework.so, not found (try using -rpath or -rpath-link)
and so on.
When I hit this problem, I first added /usr/local/cuda/lib64 and /usr/local/cuda/extras/CUPTI/lib64 to LD_LIBRARY_PATH and tried rebuilding (without --action_env). That did not work.
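Then I did a clean reconfigure and build, again without --action_env, and it worked. I cleaned my repository with git clean -xdf. Be careful: this deletes every file in the repository that git does not know about. :)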
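Maybe --action_env would avoid the need for a clean rebuild; I don't know. But if those libraries are on LD_LIBRARY_PATH before you do the first build, I would expect you not to need --action_env.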
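The sequence described in this answer might look like the sketch below. The helper prepend_path is a hypothetical name introduced here to avoid duplicate entries; the two CUDA directories are the ones named in the answer, and the destructive and long-running steps are left as comments:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend a directory to a colon-separated path
# unless it is already present; prints the resulting path.
#   $1 = directory to add, $2 = current path (may be empty)
prepend_path() {
  local dir="$1" current="$2"
  case ":$current:" in
    *":$dir:"*) printf '%s\n' "$current" ;;
    *) printf '%s\n' "${dir}${current:+:$current}" ;;
  esac
}

# Put both CUDA directories on LD_LIBRARY_PATH before the first build.
LD_LIBRARY_PATH="$(prepend_path /usr/local/cuda/extras/CUPTI/lib64 "${LD_LIBRARY_PATH:-}")"
LD_LIBRARY_PATH="$(prepend_path /usr/local/cuda/lib64 "$LD_LIBRARY_PATH")"
export LD_LIBRARY_PATH

# Then, from the TensorFlow checkout (commented out because the first
# command deletes every file git does not track):
#   git clean -xdf
#   ./configure
#   bazel build --config=opt --config=cuda tensorflow/tools/pip_package:build_pip_package
```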
Answer 3 (score: 0)
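After the error occurred, I appended /usr/local/cuda/lib64 to LD_LIBRARY_PATH. That did not work. Then I edited .tf_configure.bazelrc and added the line build --action_env LD_LIBRARY_PATH="...". I recompiled the project and the build passed!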
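The edit to .tf_configure.bazelrc can be scripted roughly as follows. This is a sketch under assumptions: the answer elides the actual value, so the helper below (add_ld_action_env, a hypothetical name) simply records whatever value the caller passes in, and it skips the append if such a line is already present:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append an --action_env line for LD_LIBRARY_PATH
# to a bazelrc file unless one is already there.
#   $1 = bazelrc file, $2 = value to record (chosen by the caller)
add_ld_action_env() {
  local rc_file="$1" value="$2"
  if ! grep -q -- '--action_env LD_LIBRARY_PATH=' "$rc_file" 2>/dev/null; then
    printf 'build --action_env LD_LIBRARY_PATH="%s"\n' "$value" >> "$rc_file"
  fi
}

# Example, run from the TensorFlow checkout:
#   add_ld_action_env .tf_configure.bazelrc "$LD_LIBRARY_PATH"
```

Putting the setting in the bazelrc file means it applies to every later bazel build in that checkout without repeating the flag on the command line.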