Cannot build TensorFlow on Ubuntu 16.04 with CUDA 8.0

Time: 2016-09-29 11:29:26

Tags: ubuntu tensorflow

Has anyone successfully installed TensorFlow with the following combination?

  • Ubuntu 16.04 (freshly installed and upgraded)
  • Nvidia driver: nvidia-370 (sudo apt install nvidia-370)
  • Bazel 0.3.1 (installed from the official package):

    chmod +x bazel-0.3.1-installer-linux-x86_64.sh
    ./bazel-0.3.1-installer-linux-x86_64.sh --user
    
  • gcc: 5.4.0 (the default on 16.04)

  • CUDA 8.0.44 (for the GTX 1070, installed from the official .run file without the bundled driver)

    export CUDA_HOME=/usr/local/cuda-8.0
    export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
    export PATH="$CUDA_HOME/bin:$PATH"
    
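  • cuDNN 5.1.5 (installed from the official package)

  • tensorflow master (cloned from the official GitHub repository)
  • python 2 and python 3 (installed from the official Ubuntu repositories)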
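
Before configuring, it can help to confirm that the toolkit directory really contains the files the build will look for. The following is a minimal sketch, not part of the original setup: the function name check_cuda_layout is hypothetical, and the list of relative paths is an assumption based on a default .run-file CUDA install with cuDNN copied into the same directory.

```shell
#!/usr/bin/env bash
# Sketch: report which expected CUDA/cuDNN files are missing under a toolkit root.
# The relative paths are assumptions about a default install, not a complete list.
check_cuda_layout() {
  local root="$1"
  local missing=0
  local rel
  for rel in bin/nvcc include/cuda.h include/cudnn.h lib64/libcudart.so lib64/libcudnn.so; do
    if [ ! -e "$root/$rel" ]; then
      echo "missing: $root/$rel"
      missing=1
    fi
  done
  # Return 0 when everything was found, 1 otherwise.
  return "$missing"
}

# Check the directory exported as CUDA_HOME in the post (falls back to its value).
check_cuda_layout "${CUDA_HOME:-/usr/local/cuda-8.0}" || echo "CUDA layout incomplete"
```

Running the same function against /usr/local/cuda as well would show whether both paths point at the same, complete install.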

Then I ran TensorFlow's ./configure:

./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] 
No Hadoop File System support will be enabled for TensorFlow
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]
/usr/lib/python3/dist-packages
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 5.1.5
Please specify the location where cuDNN 5.1.5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 
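
The answers typed into ./configure above can also be written down as environment variables, which makes every re-run use identical settings and gives something concrete to paste into a bug report. This is only a sketch: it assumes the checked-out configure script reads variables with these names instead of prompting, which should be verified against the script itself. The values simply mirror the transcript.

```shell
#!/usr/bin/env bash
# Sketch: record the configure answers from the transcript as environment variables.
# Assumption: the checked-out ./configure honors these names; verify before relying on them.
export PYTHON_BIN_PATH=/usr/bin/python3
export TF_NEED_GCP=0
export TF_NEED_HDFS=0
export TF_NEED_CUDA=1
export GCC_HOST_COMPILER_PATH=/usr/bin/gcc
export TF_CUDA_VERSION=8.0
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export TF_CUDNN_VERSION=5.1.5
export CUDNN_INSTALL_PATH=/usr/local/cuda
export TF_CUDA_COMPUTE_CAPABILITIES=3.5,5.2

# Print the recorded settings in a stable order.
env | grep -E '^(PYTHON_BIN_PATH|TF_|GCC_HOST_COMPILER_PATH|CUDA_TOOLKIT_PATH|CUDNN_INSTALL_PATH)' | sort
```

If the script does honor them, running ./configure and the build from the same shell would keep the settings visible to both steps.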

And ran the build:

bazel build -c opt --config=cuda

Some errors occurred immediately:

ERROR: /home/devymex/.cache/bazel/_bazel_devymex/05af4cc48fb50d1cc8f7e879f4c1ce83/external/local_config_cuda/crosstool/BUILD:4:1: Traceback (most recent call last):
    File "/home/devymex/.cache/bazel/_bazel_devymex/05af4cc48fb50d1cc8f7e879f4c1ce83/external/local_config_cuda/crosstool/BUILD", line 4
        error_gpu_disabled()
    File "/home/devymex/.cache/bazel/_bazel_devymex/05af4cc48fb50d1cc8f7e879f4c1ce83/external/local_config_cuda/crosstool/error_gpu_disabled.bzl", line 3, in error_gpu_disabled
        fail("ERROR: Building with --config=c...")
ERROR: Building with --config=cuda but TensorFlow is not configured to build with GPU support. Please re-run ./configure and enter 'Y' at the prompt to build with GPU support.
...
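
Because the traceback repeats long cache paths, the actual message is easy to miss. A small sketch for pulling out only the distinct top-level error lines from a saved log is shown here; build.log is a hypothetical file name and summarize_errors is a helper introduced purely for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print each distinct top-level "ERROR:" line of a saved build log once.
# build.log is a hypothetical file name, e.g. produced with:
#   bazel build -c opt --config=cuda 2>&1 | tee build.log
summarize_errors() {
  # Keep lines that start with "ERROR:" and drop exact repeats, preserving order.
  grep '^ERROR:' "$1" | awk '!seen[$0]++'
}

# Only run when a log is actually present in the current directory.
if [ -f build.log ]; then
  summarize_errors build.log
fi
```

Indented traceback lines are skipped on purpose, so the output stays short enough to quote in an issue.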

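I have tried:

  • Reinstalling Ubuntu several times.
  • Reinstalling Bazel several times (from third-party sources).
  • Trying different Bazel versions (0.3.1 and 0.3.0, from the official packages).
  • Switching to the earlier CUDA version 8.0.27 (CUDA 7 does not support the GTX 1070).
  • Trying different CUDA paths, "/usr/local/cuda" and "/usr/local/cuda-8.0" (both exist).
  • Re-running ./configure and re-running bazel build -c opt --config=cuda.
  • Running "bazel clean" before re-running ./configure.
  • Re-downloading the official tensorflow package or re-cloning the official tensorflow GitHub repo.
  • Deleting the path ~/.config/bazel before running ./configure.
  • Both python2 and python3.
  • Everything else I could think of that might be related to this error.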
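
When retrying combinations like these, running the clean, configure and build steps in one fixed order from a single shell makes the attempts easier to compare. The sketch below is an assumption about how that could be scripted, not what was actually run: run_step and rebuild are hypothetical helper names, and by default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: run the clean / configure / build steps from the list in one fixed order.
# DRY_RUN defaults to 1, which only prints the commands; set DRY_RUN=0 inside a
# TensorFlow checkout to actually execute them.
run_step() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

rebuild() {
  run_step bazel clean
  run_step ./configure
  run_step bazel build -c opt --config=cuda
}

rebuild
```

The build command is copied as written in the post; a real invocation would also name a build target.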

But the error always occurs, with the same message.

I also tried the tensorflow r0.10 package (downloaded from GitHub), and the error message changed to:

ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path.
ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path.

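With tensorflow r0.10, none of the steps tried above had any effect on these two error messages.

After four days of failure, I am very frustrated. Can anyone help me? Thanks!

I have also posted this issue on GitHub: https://github.com/tensorflow/tensorflow/issues/2559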

0 answers:

No answers