Question

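TensorFlow 1.11 fails to build with CUDA 8. I opened an issue on GitHub (#23256 [https://github.com/tensorflow/tensorflow/issues/23256]), but the TensorFlow team's response was to upgrade CUDA to 9 or downgrade TensorFlow to 1.10, neither of which is an option for me. I am trying to find a way to make TF 1.11 work with CUDA 8.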

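I am trying to build a Docker container with TF 1.11 and CUDA 8 for a GeForce GTX 1060 3GB GPU. The build keeps failing with errors. I looked at GitHub issue #22729, but the workaround described there does not work for TF 1.11, which is the version I need. The Dockerfile is included below. Any help you can provide would be greatly appreciated.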

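System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04

TensorFlow installed from (source or binary): source

TensorFlow version: TF 1.11

Python version: 2.7

Installed using virtualenv? pip? conda?: Docker

Bazel version (if compiling from source): 0.15.0

GCC/Compiler version (if compiling from source): 7.3.0

CUDA/cuDNN version: 8.0 / 7

GPU model and memory: GeForce GTX 1060 3GB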

Exact sequence of commands/steps executed before running into the problem:

sudo docker build --no-cache . -f Dockerfile.tf-1.11-py27-gpu.txt -t tf-1.11-py27-gpu
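For anyone reproducing this, here is a minimal sketch of the same build invocation that also saves the output to a log file. The `tee` capture, the `RUN_BUILD` guard, and the `build_image` function name are assumptions added for illustration and are not part of the steps I actually ran. The log file name matches the log attached to the GitHub issue.

```shell
#!/bin/sh
# Sketch only: wraps the build command from this question and captures its output.
# Assumption: the script is run from the directory that contains the Dockerfile.
set -eu

DOCKERFILE="Dockerfile.tf-1.11-py27-gpu.txt"
IMAGE_TAG="tf-1.11-py27-gpu"
LOG_FILE="tf11cuda8.log"

build_image() {
    # "." is the build context. stdout and stderr are merged so that errors
    # appear in order in the log. The pipeline returns tee's exit status,
    # so inspect the log to see whether the build itself failed.
    sudo docker build --no-cache . -f "$DOCKERFILE" -t "$IMAGE_TAG" 2>&1 | tee "$LOG_FILE"
}

# RUN_BUILD is a hypothetical guard: the build only starts when it is set to 1.
if [ "${RUN_BUILD:-0}" = "1" ]; then
    build_image
else
    echo "Would build $IMAGE_TAG from $DOCKERFILE and log to $LOG_FILE"
fi
```

Running it as `RUN_BUILD=1 sh build.sh` (the script name is hypothetical) starts the build; without the variable it only prints what it would do.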

Thank you, Kyle

Dockerfile.tf-1.11-py27-gpu

FROM nvidia/cuda:8.0-cudnn7-devel-ubuntu16.04

LABEL maintainer="Craig Citro <craigcitro@google.com>; Modified for Cuda 8 by Jack Harris"

RUN apt-get update && apt-get install -y --allow-downgrades --allow-change-held-packages --no-install-recommends \
    build-essential \
    cuda-command-line-tools-8-0 \
    cuda-cublas-dev-8-0 \
    cuda-cudart-dev-8-0 \
    cuda-cufft-dev-8-0 \
    cuda-curand-dev-8-0 \
    cuda-cusolver-dev-8-0 \
    cuda-cusparse-dev-8-0 \
    curl \
    git \
    libcudnn7=7.2.1.38-1+cuda8.0 \
    libcudnn7-dev=7.2.1.38-1+cuda8.0 \
    libnccl2=2.2.13-1+cuda8.0 \
    libnccl-dev=2.2.13-1+cuda8.0 \
    libcurl3-dev \
    libfreetype6-dev \
    libhdf5-serial-dev \
    libpng12-dev \
    libzmq3-dev \
    pkg-config \
    python-dev \
    rsync \
    software-properties-common \
    unzip \
    zip \
    zlib1g-dev \
    wget \
    && \
rm -rf /var/lib/apt/lists/* && \
find /usr/local/cuda-8.0/lib64/ -type f -name 'lib*_static.a' -not -name 'libcudart_static.a' -delete && \
rm -f /usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a

RUN apt-get update && \
    apt-get install nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda8.0 && \
    apt-get update && \
    apt-get install libnvinfer4=4.1.2-1+cuda8.0 && \
    apt-get install libnvinfer-dev=4.1.2-1+cuda8.0

# Link NCCL library and header where the build script expects them.
RUN mkdir /usr/local/cuda-8.0/lib &&  \
ln -s /usr/lib/x86_64-linux-gnu/libnccl.so.2 /usr/local/cuda/lib/libnccl.so.2 && \
ln -s /usr/include/nccl.h /usr/local/cuda/include/nccl.h

# TODO(tobyboyd): Remove after license is excluded from BUILD file.
#RUN gunzip /usr/share/doc/libnccl2/NCCL-SLA.txt.gz && \
#    cp /usr/share/doc/libnccl2/NCCL-SLA.txt /usr/local/cuda/

# Add External Mount Points
RUN mkdir -p /external_lib
RUN mkdir -p /external_bin

RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
    python get-pip.py && \
    rm get-pip.py

RUN pip --no-cache-dir install \
    ipykernel \
    jupyter \
    keras_applications==1.0.5 \
    keras_preprocessing==1.0.3 \
    matplotlib \
    numpy \
    pandas \
    scipy \
    sklearn \
    mock \
    && \
python -m ipykernel.kernelspec

# Set up our notebook config.
#COPY jupyter_notebook_config.py /root/.jupyter/

# Jupyter has issues with being run directly:
#   https://github.com/ipython/ipython/issues/7062
# We just add a little wrapper script.
# COPY run_jupyter.sh /

# Set up Bazel.

# Running bazel inside a `docker build` command causes trouble, cf:
#   https://github.com/bazelbuild/bazel/issues/134
# The easiest solution is to set up a bazelrc file forcing --batch.
RUN echo "startup --batch" >>/etc/bazel.bazelrc
# Similarly, we need to workaround sandboxing issues:
#   https://github.com/bazelbuild/bazel/issues/418
RUN echo "build --spawn_strategy=standalone --genrule_strategy=standalone" \
    >>/etc/bazel.bazelrc
# Install the most recent bazel release.
ENV BAZEL_VERSION 0.15.0
WORKDIR /
RUN mkdir /bazel && \
    cd /bazel && \
    curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -O https://github.com/bazelbuild/bazel/releases/download/$BAZEL_VERSION/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -o /bazel/LICENSE.txt https://raw.githubusercontent.com/bazelbuild/bazel/master/LICENSE && \
chmod +x bazel-*.sh && \
./bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
cd / && \
rm -f /bazel/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh

# Download and build TensorFlow.
RUN git clone http://github.com/tensorflow/tensorflow --branch r1.11 --depth=1

WORKDIR /tensorflow

RUN sed -i 's/^#if TF_HAS_.*$/#if !defined(__NVCC__)/g' tensorflow/core/platform/macros.h

ENV TF_NCCL_VERSION=2

#RUN /bin/echo -e "/usr/bin/python\n\nn\nn\nn\nn\nn\nn\nn\nn\nn\ny\n8.0\n/usr/local/cuda\n7.0\n/usr/local/cuda\n\n\n\nn\n\nn\n-march=native\nn\n" | ./configure
RUN /bin/echo -e "/usr/bin/python\n\nn\nn\nn\nn\nn\nn\nn\nn\nn\nn\ny\n8.0\n/usr/local/cuda\n7.0\n/usr/local/cuda\nn\n\n\n\n\n\nn\n\nn\n-march=native\nn\n" | ./configure
#RUN /bin/echo -e "\n\nn\nn\nn\nn\nn\n\n\n\n\n\n\n\n\n\n\n\n-march=native\nn\n" | ./configure

# Configure the build for our CUDA configuration.
ENV CI_BUILD_PYTHON python
ENV PATH /external_bin:$PATH
ENV LD_LIBRARY_PATH /external_lib:/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
ENV TF_NEED_CUDA 1
ENV TF_NEED_TENSORRT 1
ENV TF_CUDA_COMPUTE_CAPABILITIES=3.0,3.5,5.2,6.0,6.1
ENV TF_CUDA_VERSION=8.0
ENV TF_CUDNN_VERSION=7

# https://github.com/tensorflow/tensorflow/issues/17801
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1 && \
ln -s /usr/local/cuda/nvvm/libdevice/libdevice.compute_50.10.bc /usr/local/cuda/nvvm/libdevice/libdevice.10.bc && \
LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs:${LD_LIBRARY_PATH} \
tensorflow/tools/ci_build/builds/configured GPU \
bazel build -c opt --copt=-mavx --config=cuda \
--cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" \
    tensorflow/tools/pip_package/build_pip_package && \
    rm /usr/local/cuda/lib64/stubs/libcuda.so.1 

RUN bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/pip 

RUN pip --no-cache-dir install --upgrade /tmp/pip/tensorflow-*.whl && \
rm -rf /tmp/pip && \
rm -rf /root/.cache
# Clean up pip wheel and Bazel cache when done.

WORKDIR /root

# TensorBoard
EXPOSE 6006
# IPython
EXPOSE 8888

CMD [ "/bin/bash" ]

tf11cuda8.log: the build log is attached to the GitHub issue (it is too long to post here).

tensorflow 1.11 and cuda 8

0 answers: