TensorFlow 1.11无法使用CUDA 8构建。我尝试在github上打开一个问题(问题在Github#23256 [https://github.com/tensorflow/tensorflow/issues/23256]上打开),但是tensorflow团队的响应是将CUDA升级到9或将Tensorflow降级到1.10。 ,这不是我的选择。试图找到一种使TF1.11与CUDA 8配合使用的方法。
尝试在GeForce 1060 3GB GPU上使用TF 1.11和CUDA 8构建docker容器。在构建中不断发生错误。研究了Github发行号22729(#22729),但是解决方法不适用于TF 1.11,这就是需要的。 docker文件也在下面。您能提供的任何帮助将不胜感激。
系统信息
OS平台和发行版(例如Linux Ubuntu 16.04):Linux Ubuntu 16.04 从(源或二进制)安装TensorFlow:源
TensorFlow版本:TF 1.11
Python版本:2.7
使用virtualenv安装吗?点子? conda ?: Docker
Bazel版本(如果从源代码编译):0.15.0
GCC /编译器版本(如果从源代码编译):7.3.0
CUDA / cuDNN版本:8.0 / 7
GPU型号和内存:GeForce GTX 1060 3GB
提供在遇到问题之前执行的命令/步骤的确切顺序 sudo docker build --no-cache。 -f Dockerfile.tf-1.11-py27-gpu.txt -t tf-1.11-py27-gpu
谢谢你, 凯尔
Dockerfile.tf-1.11-py27-gpu
FROM nvidia/cuda:8.0-cudnn7-devel-ubuntu16.04
LABEL maintainer="Craig Citro <craigcitro@google.com>; Modified for Cuda 8 by Jack Harris"
RUN apt-get update && apt-get install -y --allow-downgrades --allow-change-held-packages --no-install-recommends \
build-essential \
cuda-command-line-tools-8-0 \
cuda-cublas-dev-8-0 \
cuda-cudart-dev-8-0 \
cuda-cufft-dev-8-0 \
cuda-curand-dev-8-0 \
cuda-cusolver-dev-8-0 \
cuda-cusparse-dev-8-0 \
curl \
git \
libcudnn7=7.2.1.38-1+cuda8.0 \
libcudnn7-dev=7.2.1.38-1+cuda8.0 \
libnccl2=2.2.13-1+cuda8.0 \
libnccl-dev=2.2.13-1+cuda8.0 \
libcurl3-dev \
libfreetype6-dev \
libhdf5-serial-dev \
libpng12-dev \
libzmq3-dev \
pkg-config \
python-dev \
rsync \
software-properties-common \
unzip \
zip \
zlib1g-dev \
wget \
&& \
rm -rf /var/lib/apt/lists/* && \
find /usr/local/cuda-8.0/lib64/ -type f -name 'lib*_static.a' -not -name 'libcudart_static.a' -delete && \
rm -f /usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a
RUN apt-get update && \
apt-get install nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda8.0 && \
apt-get update && \
apt-get install libnvinfer4=4.1.2-1+cuda8.0 && \
apt-get install libnvinfer-dev=4.1.2-1+cuda8.0
# Link NCCL libray and header where the build script expects them.
RUN mkdir /usr/local/cuda-8.0/lib && \
ln -s /usr/lib/x86_64-linux-gnu/libnccl.so.2 /usr/local/cuda/lib/libnccl.so.2 && \
ln -s /usr/include/nccl.h /usr/local/cuda/include/nccl.h
# TODO(tobyboyd): Remove after license is excluded from BUILD file.
#RUN gunzip /usr/share/doc/libnccl2/NCCL-SLA.txt.gz && \
# cp /usr/share/doc/libnccl2/NCCL-SLA.txt /usr/local/cuda/
# Add External Mount Points
RUN mkdir -p /external_lib
RUN mkdir -p /external_bin
RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
python get-pip.py && \
rm get-pip.py
RUN pip --no-cache-dir install \
ipykernel \
jupyter \
keras_applications==1.0.5 \
keras_preprocessing==1.0.3 \
matplotlib \
numpy \
pandas \
scipy \
sklearn \
mock \
&& \
python -m ipykernel.kernelspec
# Set up our notebook config.
#COPY jupyter_notebook_config.py /root/.jupyter/
# Jupyter has issues with being run directly:
# https://github.com/ipython/ipython/issues/7062
# We just add a little wrapper script.
# COPY run_jupyter.sh /
# Set up Bazel.
# Running bazel inside a `docker build` command causes trouble, cf:
# https://github.com/bazelbuild/bazel/issues/134
# The easiest solution is to set up a bazelrc file forcing --batch.
RUN echo "startup --batch" >>/etc/bazel.bazelrc
# Similarly, we need to workaround sandboxing issues:
# https://github.com/bazelbuild/bazel/issues/418
RUN echo "build --spawn_strategy=standalone --genrule_strategy=standalone" \
>>/etc/bazel.bazelrc
# Install the most recent bazel release.
ENV BAZEL_VERSION 0.15.0
WORKDIR /
RUN mkdir /bazel && \
cd /bazel && \
curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -O https://github.com/bazelbuild/bazel/releases/download/$BAZEL_VERSION/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -o /bazel/LICENSE.txt https://raw.githubusercontent.com/bazelbuild/bazel/master/LICENSE && \
chmod +x bazel-*.sh && \
./bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
cd / && \
rm -f /bazel/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh
# Download and build TensorFlow.
RUN git clone http://github.com/tensorflow/tensorflow --branch r1.11 --depth=1
WORKDIR /tensorflow
RUN sed -i 's/^#if TF_HAS_.*$/#if !defined(__NVCC__)/g' tensorflow/core/platform/macros.h
ENV TF_NCCL_VERSION=2
#RUN /bin/echo -e "/usr/bin/python\n\nn\nn\nn\nn\nn\nn\nn\nn\nn\ny\n8.0\n/usr/local/cuda\n7.0\n/usr/local/cuda\n\n\n\nn\n\nn\n-march=native\nn\n" | ./configure
RUN /bin/echo -e "/usr/bin/python\n\nn\nn\nn\nn\nn\nn\nn\nn\nn\nn\ny\n8.0\n/usr/local/cuda\n7.0\n/usr/local/cuda\nn\n\n\n\n\n\nn\n\nn\n-march=native\nn\n" | ./configure
#RUN /bin/echo -e "\n\nn\nn\nn\nn\nn\n\n\n\n\n\n\n\n\n\n\n\n-march=native\nn\n" | ./configure
# Configure the build for our CUDA configuration.
ENV CI_BUILD_PYTHON python
ENV PATH /external_bin:$PATH
ENV LD_LIBRARY_PATH /external_lib:/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
ENV TF_NEED_CUDA 1
ENV TF_NEED_TENSORRT 1
ENV TF_CUDA_COMPUTE_CAPABILITIES=3.0,3.5,5.2,6.0,6.1
ENV TF_CUDA_VERSION=8.0
ENV TF_CUDNN_VERSION=7
# https://github.com/tensorflow/tensorflow/issues/17801
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1 && \
ln -s /usr/local/cuda/nvvm/libdevice/libdevice.compute_50.10.bc /usr/local/cuda/nvvm/libdevice/libdevice.10.bc && \
LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs:${LD_LIBRARY_PATH} \
tensorflow/tools/ci_build/builds/configured GPU \
bazel build -c opt --copt=-mavx --config=cuda \
--cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" \
tensorflow/tools/pip_package/build_pip_package && \
rm /usr/local/cuda/lib64/stubs/libcuda.so.1
RUN bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/pip
RUN pip --no-cache-dir install --upgrade /tmp/pip/tensorflow-*.whl && \
rm -rf /tmp/pip && \
rm -rf /root/.cache
# Clean up pip wheel and Bazel cache when done.
WORKDIR /root
# TensorBoard
EXPOSE 6006
# IPython
EXPOSE 8888
CMD [ "/bin/bash" ]
tf11cuda8.log-附加到github问题的日志(太长了,无法在此处发布)