TensorFlow 1.5 unoptimized build for debugging: compilation fails

Time: 2018-02-08 18:58:10

Tags: c++ debugging tensorflow gpu

I build as follows in order to debug the TensorFlow 1.5 backend:

bazel build -c opt --config cuda -c dbg --strip=never //tensorflow/tools/pip_package:build_pip_package -s

This gives:

INFO: From Compiling tensorflow/contrib/nccl/kernels/nccl_manager.cc:
/usr/include/c++/6/bits/stl_pair.h(327): error: calling a __host__ function("std::_Rb_tree_const_iterator< ::tensorflow::NcclManager::NcclStream *> ::_Rb_tree_const_iterator") from a __device__ function("std::pair< ::std::_Rb_tree_const_iterator< ::tensorflow::NcclManager::NcclStream *> , bool> ::pair< ::std::_Rb_tree_iterator< ::tensorflow::NcclManager::NcclStream *>  &, bool &, (bool)1> ") is not allowed

/usr/include/c++/6/bits/stl_pair.h(327): error: identifier "std::_Rb_tree_const_iterator< ::tensorflow::NcclManager::NcclStream *> ::_Rb_tree_const_iterator" is undefined in device code

/usr/include/c++/6/bits/stl_algobase.h(1009): error: calling a __host__ function("__builtin_clzl") from a __device__ function("std::__lg") is not allowed

I suppose we could get these functions folded correctly by adding some inlining flags, without keeping the other optimizations?
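For illustration, here is a rough, untested sketch of the kind of invocation I have in mind. It assumes the host compiler is GCC, which ignores most individual optimization flags (inlining included) at -O0, so the sketch raises the level to -O1 through --copt instead of passing an inline flag alone. The choice of -O1 is my assumption: it enables more than just inlining, and I have not confirmed that it reaches the nvcc device compilation or resolves the errors above. BAZEL_ARGS and RUN_BUILD are hypothetical names used only in this sketch.

```shell
#!/usr/bin/env bash
# Untested sketch: debug build with a light optimization level so that
# small library functions can be inlined. Not a confirmed fix.

BAZEL_ARGS=(
  build
  --config=cuda
  -c dbg          # a single -c; when -c is repeated, Bazel uses the last value
  --strip=never   # keep debug symbols in the binaries
  --copt=-O1      # assumption: GCC does not inline at -O0, so raise the level
  -s              # print the subcommands that Bazel executes
  //tensorflow/tools/pip_package:build_pip_package
)

# Print the command for inspection.
printf 'bazel'
printf ' %s' "${BAZEL_ARGS[@]}"
printf '\n'

# Run the build only when explicitly requested (RUN_BUILD is a hypothetical switch).
if [ "${RUN_BUILD:-0}" = "1" ]; then
  bazel "${BAZEL_ARGS[@]}"
fi
```

By default the script only prints the command; set RUN_BUILD=1 to actually run it. It passes a single -c because, with both -c opt and -c dbg on the command line, Bazel uses the last value given.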

0 answers:

No answers