我正在使用TensorFlow 的docker安装。
我使用
启动容器nvidia-docker run -it -p 8888:8888 -v /*/Data/docker:/docker --name TensorFlow gcr.io/tensorflow/tensorflow:latest-gpu /bin/bash
这允许我链接文件夹名称" docker"在我的辅助本地驱动器中,在docker容器中有一个文件夹。
问题在于,每当我的计算机(Ubuntu - GTX 1070 - 6700k Intel CPU)进入休眠状态时,GPU就会变得不可用,代码只能在CPU上运行。当我在docker里面的ipython notebook会话中运行代码时,我得到:
对cuInit的调用失败:CUDA_ERROR_UNKNOWN。
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
E tensorflow/stream_executor/cuda/cuda_driver.cc:491] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:153] retrieving CUDA diagnostic information for host: 123456c234ds
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:160] hostname: 123456c234ds
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:185] libcuda reported version is: 367.57.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:356] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.57 Mon Oct 3 20:37:01 PDT 2016
GCC version: gcc version 4.9.3 (Ubuntu 4.9.3-13ubuntu2)
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] kernel reported version is: 367.57.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:293] kernel version seems to match DSO: 367.57.0
当我重新启动计算机时,GPU在没有UNKNOWN消息的情况下可用。
我搜索了互联网,sudo apt-get install nvidia-modprobe
等解决方案无法解决问题。