Build failure - TensorFlow on a GPU server - Error - cannot find @local_config_cuda//crosstool

Posted: 2017-08-01 22:42:59

Tags: tensorflow

I am using a GPU server with a GTX-1080 running Ubuntu 16.04 LTS. With the standard TensorFlow installation, I get the following warnings when running my application:

2017-08-01 14:49:57.232126: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 14:49:57.232157: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 14:49:57.232162: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 14:49:57.232165: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 14:49:57.232169: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
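
These warnings only mean that the prebuilt pip wheel was not compiled with the SSE4.x/AVX/FMA instruction sets; they are not errors. As a minimal sketch (my_app.py is just a placeholder for the application being run), they can be silenced with the standard TF_CPP_MIN_LOG_LEVEL environment variable, although building from source with --config=opt, as attempted below, is what actually enables those instructions:

# Silence INFO and WARNING messages from the TensorFlow C++ core
export TF_CPP_MIN_LOG_LEVEL=2
python my_app.py   # my_app.py is a hypothetical application name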

When trying to build the latest code (version v1.3.0-rc1-531-gcd4c17e) from source, the local build fails with the error message given below:

user@devbox:~/Workouts/tensorflow$ bazel build --config=opt --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" ./tensorflow/tools/pip_package:build_pip_package 
.......
ERROR: no such package '@local_config_cuda//crosstool': Traceback (most recent call last):
File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 1039
    _create_local_cuda_repository(repository_ctx)
File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 976, in _create_local_cuda_repository
    _host_compiler_includes(repository_ctx, cc)
File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 145, in _host_compiler_includes
    get_cxx_inc_directories(repository_ctx, cc)
File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 120, in get_cxx_inc_directories
    set(includes_cpp)
depsets cannot contain mutable items
INFO: Elapsed time: 5.488s
FAILED: Build did NOT complete successfully (3 packages loaded)

Some additional details about the platform and configuration are given below:

user@gpu-devbox:~/Workouts/tensorflow$ python --version
Python 2.7.12

user@gpu-devbox:~/Workouts/tensorflow$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

user@gpu-devbox:~/Workouts/tensorflow$ bazel version
Build label: 0.5.3
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Jul 28 08:34:59 2017 (1501230899)
Build timestamp: 1501230899
Build timestamp as int: 1501230899

The configuration was prepared as follows:

user@gpu-devbox:~/Workouts/tensorflow$ ./configure 
WARNING: Running Bazel server needs to be killed, because the startup options are different.
Please specify the location of python. [Default is /usr/bin/python]: 
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: 
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]: 
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [y/N]: 
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: 
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: 
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL support? [y/N]: 
No OpenCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: Y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
"Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at:     https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1,6.1,6.1,6.1]6.1
Do you want to use clang as CUDA compiler? [y/N]: 
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Do you wish to build TensorFlow with MPI support? [y/N]: 
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished
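
For reference, the same answers can also be supplied non-interactively through environment variables. This is only a sketch under the assumption that this TensorFlow version's configure script reads the usual TF_* variables; the names should be verified against the configure script in the checkout:

# Assumed variable names for a non-interactive ./configure run; verify against your checkout
export PYTHON_BIN_PATH=/usr/bin/python
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=8.0
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export TF_CUDNN_VERSION=6
export CUDNN_INSTALL_PATH=/usr/local/cuda
export TF_CUDA_COMPUTE_CAPABILITIES=6.1
./configure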

A few more details:

user@gpu-devbox:~/Workouts/tensorflow$ echo $CUDA_HOME
/usr/local/cuda-8.0
user@gpu-devbox:~/Workouts/tensorflow$ echo $LD_LIBRARY_PATH
/usr/local/cuda-8.0/lib64
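
As a quick sanity check (standard CUDA paths assumed, nothing specific to this machine), it may be worth confirming that the toolkit and the cuDNN 6 libraries really are where configure was pointed:

nvcc --version                           # should report release 8.0
ls /usr/local/cuda/lib64/libcudnn.so*    # cuDNN 6 libraries expected here, per the configure answers above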

Am I missing something?

1 Answer:

Answer 0: (score: 0)

Try adding this to your build command:

--crosstool_top=@local_config_cuda//crosstool:toolchain
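
Combined with the build command from the question, that would look roughly like this (the extra flag is the only change; the rest is the original invocation):

bazel build --config=opt --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" \
    --crosstool_top=@local_config_cuda//crosstool:toolchain \
    ./tensorflow/tools/pip_package:build_pip_package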

Here are all of the Dockerfiles for TF + TF Serving, including CPU, GPU, AVX, etc.:

https://github.com/fluxcapacitor/pipeline/tree/master/package.ml/tensorflow/16d39e9-d690fdd

The naming convention is tensorflow-[tf_git_hash]-[tf_serving_git_hash]. These Dockerfiles are up to date as of a few days ago.

Another good resource is the TensorFlow Jenkins/CI page: http://ci.tensorflow.org/

Their builds are heavily parameterized, so adapting them to your own environment is a bit tricky, but they definitely helped us arrive at the Dockerfiles mentioned above.