TensorFlow build error: scatter_nd_op_gpu.cu.o was not created

Time: 2018-04-16 23:51:21

Tags: python tensorflow

I have been trying to install TensorFlow from source, but I get this error:

Error limit reached.
100 errors detected in the compilation of "/tmp/tmpxft_000076fb_00000000-7_scatter_nd_op_gpu.cu.cpp1.ii".
Compilation terminated.
ERROR: /home/rosgori/Python/tengpu/tensorflow/tensorflow/core/kernels/BUILD:4149:1: output 'tensorflow/core/kernels/_objs/scatter_nd_op_gpu/tensorflow/core/kernels/scatter_nd_op_gpu.cu.o' was not created
ERROR: /home/rosgori/Python/tengpu/tensorflow/tensorflow/core/kernels/BUILD:4149:1: not all outputs were created or valid
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 667.700s, Critical Path: 71.44s
FAILED: Build did NOT complete successfully

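System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): master
  • Python version: 3.6.4
  • Bazel version (if compiling from source): 0.12.0
  • GCC/compiler version (if compiling from source): 5.4.0
  • CUDA/cuDNN version: 8.0 / 7.0
  • GPU model and memory: GeForce GT 740M; 2004MiB

Exact command to reproduce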

bazel build --verbose_failures -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.1 --copt=-msse4.2 --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

By master I mean that I did not use r1.8 or r1.7. Timestamp of master:

from datetime import datetime
datetime.utcnow()

I get: datetime.datetime(2018, 4, 16, 23, 30, 23, 844800)

So my questions are:

What does this error mean? Can it be fixed?

Edit

Sometimes I get this instead:

20 errors detected in the compilation of "/tmp/tmpxft_000016a4_00000000-7_gather_functor_gpu.cu.cpp1.ii".
ERROR: /home/rosgori/Python/tengpu/tensorflow/tensorflow/core/kernels/BUILD:1208:1: output 'tensorflow/core/kernels/_objs/gather_functor_gpu/tensorflow/core/kernels/gather_functor_gpu.cu.o' was not created
ERROR: /home/rosgori/Python/tengpu/tensorflow/tensorflow/core/kernels/BUILD:1208:1: not all outputs were created or valid
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 157.766s, Critical Path: 35.22s
FAILED: Build did NOT complete successfully
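
Both logs have the same shape: nvcc prints an error-count summary for a .cu file, and Bazel then reports that the matching .cu.o output was not created. The "was not created" line follows from the failed compilation, so the compiler messages printed above the summary are the ones worth reading. A minimal sketch for pulling those two kinds of lines out of a saved build log, assuming the log has been captured as text (the helper name `summarize_build_log` is hypothetical and not part of Bazel or TensorFlow):

```python
import re

# Bazel line of the form: output '<path>' was not created
_MISSING_OUTPUT = re.compile(r"output '([^']+)' was not created")
# nvcc summary of the form: <N> errors detected in the compilation of "<file>".
_NVCC_SUMMARY = re.compile(r'(\d+) errors? detected in the compilation of "([^"]+)"')


def summarize_build_log(log_text):
    """Return (missing_outputs, nvcc_summaries) found in a build log."""
    missing = _MISSING_OUTPUT.findall(log_text)
    summaries = [(int(count), path) for count, path in _NVCC_SUMMARY.findall(log_text)]
    return missing, summaries


# Two lines copied from the second log in the question.
sample = (
    '20 errors detected in the compilation of '
    '"/tmp/tmpxft_000016a4_00000000-7_gather_functor_gpu.cu.cpp1.ii".\n'
    "ERROR: /home/rosgori/Python/tengpu/tensorflow/tensorflow/core/kernels/BUILD:1208:1: "
    "output 'tensorflow/core/kernels/_objs/gather_functor_gpu/tensorflow/core/kernels/"
    "gather_functor_gpu.cu.o' was not created\n"
)

missing, summaries = summarize_build_log(sample)
print(missing)
print(summaries)
```

Running this over the full log lists every kernel whose object file is missing, which helps confirm whether different runs fail on the same kind of file (here, GPU kernels compiled by nvcc).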

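2 answers:

Answer 0: (score: 1)

I solved these problems as follows:

  1. Install gcc 4.9 on my Ubuntu 16.04

    According to https://www.tensorflow.org/install/install_sources, the binary packages are built with gcc 4, so there are some incompatibilities with gcc 5+.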

       sudo apt-get install python-software-properties
       sudo add-apt-repository ppa:ubuntu-toolchain-r/test
       sudo apt-get update
       sudo apt-get install gcc-4.9
       sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 50
       sudo apt-get install g++-4.9
       sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 50     
    
  2. Then, when running configure, set the path to gcc to /usr/bin/gcc-4.9

  3. Modify /tensorflow/core/platform/macros.h

    See https://github.com/tensorflow/tensorflow/issues/19203

    Replace:

       #define TF_PREDICT_FALSE(x) (__builtin_expect(x, 0))
       #define TF_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))

    with:

       #define TF_PREDICT_FALSE(x) (x)
       #define TF_PREDICT_TRUE(x) (x)
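
Before rerunning configure, it may help to confirm that the compiler at /usr/bin/gcc-4.9 really reports a 4.9.x version. The following is a rough sketch, not part of the TensorFlow build: the helper names `parse_gcc_version` and `gcc_version` are hypothetical, and the sample string is an illustrative line in the usual `gcc --version` layout rather than output captured from the machine in the question.

```python
import re
import subprocess


def parse_gcc_version(version_output):
    """Extract (major, minor, patch) from the first line of `gcc --version` output."""
    first_line = version_output.splitlines()[0]
    # Drop the parenthesised distribution tag so its numbers are not matched.
    without_tag = re.sub(r"\([^)]*\)", "", first_line)
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", without_tag)
    if match is None:
        raise ValueError("no version number found in: " + first_line)
    return tuple(int(part) for part in match.groups())


def gcc_version(gcc_path="/usr/bin/gcc-4.9"):
    """Run the given compiler with --version and return its parsed version."""
    result = subprocess.run(
        [gcc_path, "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_gcc_version(result.stdout)


# Illustrative input only (assumed format, not captured output).
sample_version = parse_gcc_version("gcc-4.9 (Ubuntu 4.9.4-2ubuntu1~16.04) 4.9.4\n")
print(sample_version)
```

On the machine itself, `gcc_version()` would query the compiler at the path given to configure in step 2; a result whose first two components are not 4 and 9 suggests the path points at a different compiler.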
      

答案 1 :(得分:1)

Adding --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" to the bazel build is recommended.

For example:

bazel build --config=opt --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package
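
If the build is driven from a script, the same invocation can be assembled as an argument list, which avoids shell quoting problems around the --cxxopt value. This is only a sketch under that assumption; `bazel_build_command` is a hypothetical helper, and the flags are exactly the ones shown in the command above.

```python
import shlex


def bazel_build_command(target,
                        configs=("opt", "cuda"),
                        cxxopts=("-D_GLIBCXX_USE_CXX11_ABI=0",)):
    """Build the argv list for a `bazel build` call with the given options."""
    args = ["bazel", "build"]
    args += ["--config=" + config for config in configs]
    args += ["--cxxopt=" + opt for opt in cxxopts]
    args.append(target)
    return args


command = bazel_build_command("//tensorflow/tools/pip_package:build_pip_package")
# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in command))
```

The list can be passed directly to `subprocess.run(command, check=True)` from the TensorFlow source directory; the sketch only prints it so that nothing is built by accident.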