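Installing TensorFlow for object detection can be frustrating, especially when weird errors show up after you have started your own object detection project by fine-tuning a pre-trained model.
How do I correctly install the latest TensorFlow with GPU support together with the latest CUDA / cuDNN?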
Answer 0 (score: 0)
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install vim curl python-dev gnupg-curl python-tk git
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
sudo -H python get-pip.py
sudo -H pip install tensorflow-gpu
// Turn off Secure Boot in your BIOS before installing the NVIDIA driver
sudo apt-get install gnupg-curl
// Here we install version 10.0 to avoid other issues. Later we can upgrade it to version 10.1
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
sudo apt-get update
sudo apt-get install --no-install-recommends cuda-10-0
// restart your computer here
nvidia-smi
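As a quick sanity check after the reboot, something like the following could confirm that the driver actually responds before you go on to cuDNN. This is only a sketch: `driver_ok` is a name made up here, and it checks nothing beyond the exit status of `nvidia-smi`.

```shell
# driver_ok is a hypothetical helper name, not part of any NVIDIA tool.
# It succeeds only if nvidia-smi is on PATH and exits with status 0.
driver_ok() {
    command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1
}

if driver_ok; then
    echo "NVIDIA driver responds"
else
    echo "nvidia-smi missing or failing; check Secure Boot and reboot again" >&2
fi
```

If the check fails, there is little point in continuing with the cuDNN and TensorRT packages until the driver loads.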
sudo apt-get install --no-install-recommends libcudnn7=7.6.3.30-1+cuda10.0 libcudnn7-dev=7.6.3.30-1+cuda10.0
sudo apt-get install -y --no-install-recommends libnvinfer5=5.1.5-1+cuda10.0 libnvinfer-dev=5.1.5-1+cuda10.0
sudo apt-get update
// this upgrade command will upgrade your CUDA to version 10.1
sudo apt-get upgrade
sudo apt-get autoremove
sudo -H pip install Cython
sudo -H pip install contextlib2
sudo -H pip install pillow
sudo -H pip install lxml
sudo -H pip install jupyter
sudo -H pip install matplotlib
mkdir tensorflow
cd tensorflow
git clone https://github.com/tensorflow/models
// install protobuf (protoc) version 3.0.0
wget -O protobuf.zip https://github.com/google/protobuf/releases/download/v3.0.0/protoc-3.0.0-linux-x86_64.zip
unzip protobuf.zip
sudo cp ./bin/protoc /bin/
sudo cp -r ./include/google /usr/local/include/
cd models/research
protoc object_detection/protos/*.proto --python_out=.
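To make sure the `protoc` step really compiled everything, you could list any `.proto` file that has no matching `_pb2.py` next to it. A minimal sketch, assuming it is run from `models/research`; `missing_pb2` is a hypothetical helper name.

```shell
# missing_pb2 is a hypothetical helper: it prints every .proto file in the
# given directory that has no generated <name>_pb2.py beside it.
missing_pb2() {
    # $1 = directory containing .proto files
    for proto in "$1"/*.proto; do
        [ -e "$proto" ] || continue          # glob matched nothing
        py="${proto%.proto}_pb2.py"
        [ -e "$py" ] || echo "$proto"
    done
}

# Prints nothing when every .proto file has been compiled.
missing_pb2 object_detection/protos
```

Empty output means every proto file has a generated Python module; any path printed points at a file that `protoc` skipped or failed on.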
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make
cp -r pycocotools <path_to_tensorflow>/models/research/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.1/lib64
export PYTHONPATH=$PYTHONPATH:~/tensorflow/models/research:~/tensorflow/models/research/object_detection/slim
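The two exports above only last for the current shell session. If you want them in every new terminal, one option is to append them to `~/.bashrc` without creating duplicates on repeated runs. This is a sketch, not part of the original steps; `add_line_once` is a made-up helper name and `~/.bashrc` is an assumption about your shell setup.

```shell
# add_line_once is a hypothetical helper: it appends a line to a file
# only if that exact line is not already present.
add_line_once() {
    # $1 = file, $2 = line to add
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example usage (left commented out so nothing is changed by accident).
# Single quotes keep the variables unexpanded until the file is sourced.
# add_line_once ~/.bashrc 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.1/lib64'
# add_line_once ~/.bashrc 'export PYTHONPATH=$PYTHONPATH:~/tensorflow/models/research:~/tensorflow/models/research/object_detection/slim'
```

The exact-line match (`grep -x -F`) means running the helper twice leaves the file unchanged the second time.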
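After the upgrade to CUDA 10.1, training may fail with an error that libxxxx.so.10.0 cannot be found, because TensorFlow still looks for the 10.0 library names.
To work around this, create symbolic links from the installed 10.1 libraries to the 10.0 names: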
a. sudo ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10.1 /usr/local/cuda-10.1/lib64/libcublas.so.10.0
b. sudo ln -s /usr/local/cuda-10.1/lib64/libcudart.so.10.1 /usr/local/cuda-10.1/lib64/libcudart.so.10.0
c. sudo ln -s /usr/local/cuda-10.1/lib64/libcufft.so.10 /usr/local/cuda-10.1/lib64/libcufft.so.10.0
d. sudo ln -s /usr/local/cuda-10.1/lib64/libcurand.so.10 /usr/local/cuda-10.1/lib64/libcurand.so.10.0
e. sudo ln -s /usr/local/cuda-10.1/lib64/libcusolver.so.10 /usr/local/cuda-10.1/lib64/libcusolver.so.10.0
f. sudo ln -s /usr/local/cuda-10.1/lib64/libcusparse.so.10 /usr/local/cuda-10.1/lib64/libcusparse.so.10.0
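The six links above can also be created in one pass with a small helper that skips libraries that are not installed and links that already exist. This is a sketch under assumptions: `link_compat` and `CUDA_LIB` are names invented here, the paths are the ones listed above, and on a real system the script has to run as root (for example via `sudo sh`).

```shell
# link_compat is a hypothetical helper: it creates a symlink only when the
# real library exists and the link name is not already taken.
link_compat() {
    # $1 = existing library, $2 = compatibility name to create
    if [ ! -e "$1" ]; then
        echo "skip: $1 not found" >&2
        return 1
    fi
    if [ -e "$2" ] || [ -L "$2" ]; then
        echo "skip: $2 already exists" >&2
        return 0
    fi
    ln -s "$1" "$2"
}

# CUDA_LIB is an assumed variable; the default is the path used above.
CUDA_LIB="${CUDA_LIB:-/usr/local/cuda-10.1/lib64}"

for lib in cufft curand cusolver cusparse; do
    link_compat "$CUDA_LIB/lib$lib.so.10" "$CUDA_LIB/lib$lib.so.10.0" || true
done
link_compat "$CUDA_LIB/libcudart.so.10.1" "$CUDA_LIB/libcudart.so.10.0" || true
link_compat /usr/lib/x86_64-linux-gnu/libcublas.so.10.1 "$CUDA_LIB/libcublas.so.10.0" || true
```

Because existing links are left alone, the script is safe to run again after a later package upgrade.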