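I am trying to build TF for Android using bazel. I noticed that when I build TF with the makefile, the C++ code is optimized and the resulting library is almost 2x faster than the one produced by bazel. What could be the reason? Here is my modified tf_copts():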
def tf_copts():
  return ([
      "-Wno-sign-compare",
      "-fno-exceptions",
  ] +
  if_cuda(["-DGOOGLE_CUDA=1"]) +
  if_android_arm(["-mfpu=neon", "-mfloat-abi=softfp"]) +
  if_x86(["-msse4.1"]) +
  select({
      "//tensorflow:android": [
          "-DNDEBUG",
          "-std=c++11",
          "-DTF_LEAN_BINARY",
          "-O2",
          "-fno-rtti",
          "-DGOOGLE_PROTOBUF_NO_RTTI",
          "-DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER",
          "-fPIE",
          "-finline-functions",
          "-funswitch-loops",
          "-fpredictive-commoning",
          "-fgcse-after-reload",
          "-ftree-loop-distribute-patterns",
          "-fvect-cost-model",
          "-ftree-partial-pre",
          "-fpeel-loops"
      ],
      "//tensorflow:darwin": [],
      "//tensorflow:windows": [
          "/DLANG_CXX11",
          "/D__VERSION__=\\\"MSVC\\\"",
          "/DPLATFORM_WINDOWS",
          "/DEIGEN_HAS_C99_MATH",
          "/DTENSORFLOW_USE_EIGEN_THREADPOOL",
      ],
      "//tensorflow:ios": ["-std=c++11"],
      "//conditions:default": ["-pthread"]}))
This is the build command I used:
bazel build -c opt //tensorflow/contrib/android:libtensorflow_inference.so --crosstool_top=//external:android/crosstool \
  --host_crosstool_top=@bazel_tools//tools/cpp:toolchain --cpu=armeabi-v7a
The C++ flags section of the makefile:
CXXFLAGS +=\
--sysroot $(NDK_ROOT)/platforms/android-$(ANDROID_API_VERSION)/arch-$(sysroot_arch) \
-Wno-narrowing \
-fPIE \
-DGOOGLE_PROTOBUF_NO_RTTI \
-DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER \
-DTF_LEAN_BINARY \
-O2 \
-finline-functions \
-funswitch-loops \
-fpredictive-commoning \
-fgcse-after-reload \
-ftree-loop-distribute-patterns \
-fvect-cost-model \
-ftree-partial-pre \
-fpeel-loops \
-mfloat-abi=softfp \
-mfpu=neon \
-march=armv7-a
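To see where the two flag sets differ, here is a rough Python sketch that diffs only the flags quoted in this post. It assumes that the if_android_arm() entries and the "//tensorflow:android" branch both apply to my --cpu=armeabi-v7a build and that if_cuda()/if_x86() contribute nothing. It leaves out --sysroot (a path option) and cannot see any flags added by the Android crosstool or by the rest of the makefile, so it is not a complete picture of either compile line.

```python
# Flags quoted in tf_copts() above. Assumption: for the armeabi-v7a build,
# the unconditional list, if_android_arm(), and the android select() branch apply.
bazel_flags = {
    "-Wno-sign-compare", "-fno-exceptions",
    "-mfpu=neon", "-mfloat-abi=softfp",
    "-DNDEBUG", "-std=c++11", "-DTF_LEAN_BINARY", "-O2", "-fno-rtti",
    "-DGOOGLE_PROTOBUF_NO_RTTI", "-DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER",
    "-fPIE", "-finline-functions", "-funswitch-loops",
    "-fpredictive-commoning", "-fgcse-after-reload",
    "-ftree-loop-distribute-patterns", "-fvect-cost-model",
    "-ftree-partial-pre", "-fpeel-loops",
}

# Flags quoted in the makefile CXXFLAGS above (--sysroot omitted: it is a
# path option, not a code generation flag).
makefile_flags = {
    "-Wno-narrowing", "-fPIE",
    "-DGOOGLE_PROTOBUF_NO_RTTI", "-DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER",
    "-DTF_LEAN_BINARY", "-O2", "-finline-functions", "-funswitch-loops",
    "-fpredictive-commoning", "-fgcse-after-reload",
    "-ftree-loop-distribute-patterns", "-fvect-cost-model",
    "-ftree-partial-pre", "-fpeel-loops",
    "-mfloat-abi=softfp", "-mfpu=neon", "-march=armv7-a",
}

# Set differences show which flags appear on one side only.
only_makefile = sorted(makefile_flags - bazel_flags)
only_bazel = sorted(bazel_flags - makefile_flags)

print("Only in makefile:", only_makefile)
print("Only in bazel:   ", only_bazel)
```

Among the flags listed here, the ones that appear only on the makefile side are -Wno-narrowing and -march=armv7-a. Whether either of them accounts for the speed difference is something I have not verified.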