How do I compile caffe_rtpose on OSX?

Time: 2017-03-26 17:02:19

Tags: c++ clang openmp caffe cudnn

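I recently discovered caffe_rtpose and tried to compile and run the example. Unfortunately, I'm not very experienced with C++, so I ran into a lot of compile and link problems.

I tried adapting the Makefile config (modified from the existing Ubuntu config). (I'm on a system running OSX 10.11.5 with an NVIDIA GeForce 750M, and I have CUDA 7.5 and libcudnn installed):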

## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
#   You should not set this flag if you will be reading LMDBs with any
#   possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3
# OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
        -gencode arch=compute_35,code=sm_35 \
        -gencode arch=compute_50,code=sm_50 \
        -gencode arch=compute_50,code=compute_50 \
        -gencode arch=compute_52,code=sm_52 \
        # -gencode arch=compute_60,code=sm_60 \
        # -gencode arch=compute_61,code=sm_61
# Deprecated
#CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
#       -gencode arch=compute_20,code=sm_21 \
#       -gencode arch=compute_30,code=sm_30 \
#       -gencode arch=compute_35,code=sm_35 \
#       -gencode arch=compute_50,code=sm_50 \
#       -gencode arch=compute_50,code=compute_50

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib
BLAS_INCLUDE := /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/Headers/
BLAS_LIB := /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
PYTHON_INCLUDE := /usr/include/python2.7 \
        /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
        # $(ANACONDA_HOME)/include/python2.7 \
        # $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \

# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
# WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
# INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
# LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
# Q ?= @

This is a modified version of the install_caffe_and_cpm_osx.sh script:

#!/bin/bash



echo "------------------------- INSTALLING CAFFE AND CPM -------------------------"
echo "NOTE: This script assumes that CUDA and cuDNN are already installed on your machine. Otherwise, it might fail."



function exitIfError {
    if [[ $? -ne 0 ]] ; then
        echo ""
        echo "------------------------- -------------------------"
        echo "Errors detected. Exiting script. The software might have not been successfully installed."
        echo "------------------------- -------------------------"
        exit 1
    fi
}



# echo "------------------------- Checking Ubuntu Version -------------------------"
# ubuntu_version="$(lsb_release -r)"
# echo "Ubuntu $ubuntu_version"
# if [[ $ubuntu_version == *"14."* ]]; then
#     ubuntu_le_14=true
# elif [[ $ubuntu_version == *"16."* || $ubuntu_version == *"15."* || $ubuntu_version == *"17."* || $ubuntu_version == *"18."* ]]; then
#     ubuntu_le_14=false
# else
#     echo "Ubuntu release older than version 14. This installation script might fail."
#     ubuntu_le_14=true
# fi
# exitIfError
# echo "------------------------- Ubuntu Version Checked -------------------------"
# echo ""



echo "------------------------- Checking Number of Processors -------------------------"
NUM_CORES=$(grep -c ^processor /proc/cpuinfo 2>/dev/null || sysctl -n hw.ncpu)
echo "$NUM_CORES cores"
exitIfError
echo "------------------------- Number of Processors Checked -------------------------"
echo ""



echo "------------------------- Installing some Caffe Dependencies -------------------------"
# Basic
# sudo apt-get --assume-yes update
# sudo apt-get --assume-yes install build-essential
#General dependencies
brew install protobuf leveldb snappy hdf5
# with Python pycaffe needs dependencies built from source - from http://caffe.berkeleyvision.org/install_osx.html
# brew install --build-from-source --with-python -vd protobuf
# brew install --build-from-source -vd boost boost-python
# without Python the usual installation suffices
brew install boost
# sudo apt-get --assume-yes install libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler
# sudo apt-get --assume-yes install --no-install-recommends libboost-all-dev
# Remaining dependencies, 14.04
brew install gflags glog lmdb
# if [[ $ubuntu_le_14 == true ]]; then
#     sudo apt-get --assume-yes install libgflags-dev libgoogle-glog-dev liblmdb-dev
# fi
# OpenCV 2.4
# sudo apt-get --assume-yes install libopencv-dev
exitIfError
echo "------------------------- Some Caffe Dependencies Installed -------------------------"
echo ""



echo "------------------------- Compiling Caffe & CPM -------------------------"
cp Makefile.config.OSX.10.11.5.example Makefile.config
make all -j$NUM_CORES
# make test -j$NUM_CORES
# make runtest -j$NUM_CORES
exitIfError
echo "------------------------- Caffe & CPM Compiled -------------------------"
echo ""


# echo "------------------------- Installing CPM -------------------------"
# echo "Compiled"
# exitIfError
# echo "------------------------- CPM Installed -------------------------"
# echo ""



echo "------------------------- Downloading CPM Models -------------------------"
models_folder="./model/"
# COCO
coco_folder="$models_folder"coco/""
coco_model="$coco_folder"pose_iter_440000.caffemodel""
if [ ! -f $coco_model ]; then
    wget http://posefs1.perception.cs.cmu.edu/Users/tsimon/Projects/coco/data/models/coco/pose_iter_440000.caffemodel -P $coco_folder
fi
exitIfError
# MPI
mpi_folder="$models_folder"mpi/""
mpi_model="$mpi_folder"pose_iter_160000.caffemodel""
if [ ! -f $mpi_model ]; then
    wget http://posefs1.perception.cs.cmu.edu/Users/tsimon/Projects/coco/data/models/mpi/pose_iter_160000.caffemodel -P $mpi_folder
fi
exitIfError
echo "Models downloaded"
echo "------------------------- CPM Models Downloaded -------------------------"
echo ""



echo "------------------------- CAFFE AND CPM INSTALLED -------------------------"
echo ""

But I get this error:

examples/rtpose/rtpose.cpp:1088:22: error: variable length array of non-POD element type 'Frame'
    Frame frame_batch[BATCH_SIZE];

I tried swapping the array for a vector:

std::vector<Frame> frame_batch;
    std::cout << "allocating " << BATCH_SIZE << " frames" << std::endl;
    frame_batch.reserve(BATCH_SIZE);
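
One thing to watch with this swap: `reserve` only allocates capacity and leaves the vector empty, while the compiler output quoted further down shows rtpose.cpp indexing `frame_batch[0]` and `frame_batch[n]`. Constructing the vector with a size avoids out-of-bounds access. A minimal sketch, assuming a stand-in `Frame` type (the real struct in rtpose.cpp has different members) and hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the Frame type in rtpose.cpp (assumption: the real struct
// has different members; a std::string member is enough to make it non-POD).
struct Frame {
    std::string label;
    int index = 0;
};

// Hypothetical helper: builds a batch whose elements actually exist,
// so frame_batch[i] is valid for every i < batch_size.
std::vector<Frame> make_frame_batch(std::size_t batch_size) {
    std::vector<Frame> frame_batch(batch_size);  // size() == batch_size
    assert(frame_batch.size() == batch_size);
    return frame_batch;
}

// For contrast: reserve() changes capacity only, size() stays 0.
std::size_t size_after_reserve(std::size_t batch_size) {
    std::vector<Frame> frame_batch;
    frame_batch.reserve(batch_size);
    return frame_batch.size();
}
```

In rtpose.cpp the equivalent change would be `std::vector<Frame> frame_batch(BATCH_SIZE);` in place of the `reserve` call, assuming `Frame` is default-constructible (the original array declaration implies it is).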

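This seemed to resolve the compile error, but now I get a linker error:

ld: library not found for -lgomp
clang: error: linker command failed with exit code 1 (use -v to see invocation)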

I searched for libgomp and found some related posts on Caffe and OpenMP that mention problems with clang and OpenMP on OSX. What I've tried:

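  1. Following this post, I installed gcc 4.9 with Homebrew (as the Homebrew formula for gcc 5 installs 5.9, which might be too high?)
  2. I set -fopenmp=libomp as per Andrey Bokhanko's answer. This didn't work for me: g++-4.9: error: unrecognized command line option '-fopenmp=libomp'
  3. I can download and build Caffe on its own using the official instructions, but I can't seem to figure out how to compile this awesome demo.

Unfortunately I'm not experienced with C++ and OpenMP, so I could really use your advice here. Thanks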
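
For context, `-lgomp` asks the linker for libgomp, GCC's OpenMP runtime, which the default clang on OSX does not provide. If OpenMP is dropped from the build, `#pragma omp` lines are simply ignored, but any direct use of `omp.h` needs a guard. The question does not show whether rtpose.cpp makes such calls, so the following is only a sketch with hypothetical function names, relying on the standard `_OPENMP` macro:

```cpp
#include <vector>

#ifdef _OPENMP
#include <omp.h>  // only present when the compiler runs with OpenMP enabled
#endif

// Hypothetical helper: number of threads the loop below may use.
int available_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;  // built without OpenMP: the pragma is ignored, code runs serially
#endif
}

// Sums a vector; parallel when OpenMP is enabled, serial otherwise.
long sum_values(const std::vector<int>& values) {
    long total = 0;
#pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < static_cast<int>(values.size()); ++i) {
        total += values[i];
    }
    return total;
}
```

The same source then builds with or without an OpenMP flag, at the cost of running single-threaded in the latter case.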

    Update: I tried Mark Setchell's helpful suggestion of installing llvm to get its clang. I updated the Makefile config to use

    CUSTOM_CXX := /usr/local/opt/llvm/bin/clang++
    

    But CUDA doesn't like it:

    nvcc fatal   : The version ('30801') of the host compiler ('clang') is not supported
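
The number in that message looks like the host compiler version packed into a single integer, so '30801' would correspond to clang 3.8.1, which nvcc from CUDA 7.5 evidently rejects. A small sketch for checking which version a given compiler reports, assuming the packing is major*10000 + minor*100 + patch (an inference from the message, not documented behavior) and using hypothetical function names:

```cpp
// Hypothetical helper: packs a version the way the number in the nvcc
// message appears to be packed (assumption: major*10000 + minor*100 + patch).
constexpr int encode_version(int major, int minor, int patch) {
    return major * 10000 + minor * 100 + patch;
}

// Version of the compiler building this file, in the same encoding.
int host_compiler_version() {
#if defined(__clang__)
    return encode_version(__clang_major__, __clang_minor__, __clang_patchlevel__);
#elif defined(__GNUC__)
    return encode_version(__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
#else
    return 0;  // unknown compiler
#endif
}

// Compile-time check of the assumed packing against the value in the error.
static_assert(encode_version(3, 8, 1) == 30801, "3.8.1 packs to 30801");
```

Comparing this value for the candidate compiler against the host compilers listed for the installed CUDA release should show up front whether nvcc will accept it.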
    

    I tried compiling with CPU_ONLY, but I still get CUDA errors:

    examples/rtpose/rtpose.cpp:235:5: error: use of undeclared identifier 'cudaMalloc'
        cudaMalloc(&net_copies[device_id].canvas, DISPLAY_RESOLUTION_WIDTH * DISPLAY_RESOLUTION_HEIGHT * 3 * sizeof(float));
        ^
    examples/rtpose/rtpose.cpp:236:5: error: use of undeclared identifier 'cudaMalloc'
        cudaMalloc(&net_copies[device_id].joints, MAX_NUM_PARTS*3*MAX_PEOPLE * sizeof(float) );
        ^
    examples/rtpose/rtpose.cpp:1130:146: error: use of undeclared identifier 'cudaMemcpyHostToDevice'
                    cudaMemcpy(net_copies[tid].canvas, frame.data_for_mat, DISPLAY_RESOLUTION_WIDTH * DISPLAY_RESOLUTION_HEIGHT * 3 * sizeof(float), cudaMemcpyHostToDevice);
                                                                                                                                                     ^
    examples/rtpose/rtpose.cpp:1136:108: error: use of undeclared identifier 'cudaMemcpyHostToDevice'
                    cudaMemcpy(pointer + 0 * offset, frame_batch[0].data, BATCH_SIZE * offset * sizeof(float), cudaMemcpyHostToDevice);
                                                                                                               ^
    examples/rtpose/rtpose.cpp:1178:13: error: use of undeclared identifier 'cudaMemcpyHostToDevice'
                cudaMemcpyHostToDevice);
                ^
    examples/rtpose/rtpose.cpp:1192:155: error: use of undeclared identifier 'cudaMemcpyDeviceToHost'
                    cudaMemcpy(frame_batch[n].data_for_mat, net_copies[tid].canvas, DISPLAY_RESOLUTION_HEIGHT * DISPLAY_RESOLUTION_WIDTH * 3 * sizeof(float), cudaMemcpyDeviceToHost);
                                                                                                                                                              ^
    examples/rtpose/rtpose.cpp:1202:155: error: use of undeclared identifier 'cudaMemcpyDeviceToHost'
                    cudaMemcpy(frame_batch[n].data_for_mat, net_copies[tid].canvas, DISPLAY_RESOLUTION_HEIGHT * DISPLAY_RESOLUTION_WIDTH * 3 * sizeof(float), cudaMemcpyDeviceToHost);
    

    I'm no expert, but from a quick look at the code I can't see how a CPU_ONLY build would work with the CUDA calls.
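
The errors above are what unguarded CUDA runtime calls produce once the CUDA headers are no longer included. Caffe's own sources wrap GPU code in a CPU_ONLY preprocessor guard; the sketch below shows that pattern for the kind of allocation and copy calls in the error output. The wrapper names are hypothetical and not part of caffe_rtpose, and CPU_ONLY is defined in the file only so the sketch builds without CUDA:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Assumption: defined here only so the sketch builds without CUDA;
// in a real build the flag would come from the build system.
#define CPU_ONLY

#ifndef CPU_ONLY
#include <cuda_runtime.h>
#endif

// Hypothetical wrapper: allocates `count` floats on the GPU, or on the
// host when built with CPU_ONLY. Returns nullptr on failure.
float* alloc_floats(std::size_t count) {
#ifndef CPU_ONLY
    float* ptr = nullptr;
    if (cudaMalloc(reinterpret_cast<void**>(&ptr), count * sizeof(float)) != cudaSuccess) {
        return nullptr;
    }
    return ptr;
#else
    return static_cast<float*>(std::malloc(count * sizeof(float)));
#endif
}

// Hypothetical wrapper: copies host data into a buffer from alloc_floats.
void upload_floats(float* dst, const float* src, std::size_t count) {
    assert(dst != nullptr && src != nullptr);
#ifndef CPU_ONLY
    cudaMemcpy(dst, src, count * sizeof(float), cudaMemcpyHostToDevice);
#else
    std::memcpy(dst, src, count * sizeof(float));
#endif
}

// Hypothetical wrapper: releases a buffer from alloc_floats.
void free_floats(float* ptr) {
#ifndef CPU_ONLY
    cudaFree(ptr);
#else
    std::free(ptr);
#endif
}
```

Guarding the memory calls like this would only clear these particular errors. Any rendering or post-processing that rtpose.cpp does in GPU kernels would still need a CPU path, so this is not a complete CPU_ONLY port.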

    Looking at the Caffe OSX Installation guide again, I might try the "not for the faint of heart" route.

1 Answer:

Answer 0 (score: 0)

I finally managed to compile the rtpose example.

This is what I did:

As mentioned above, swapped the Frame array for a vector in examples/rtpose/rtpose.cpp:

std::vector<Frame> frame_batch;
    std::cout << "allocating " << BATCH_SIZE << " frames" << std::endl;
    frame_batch.reserve(BATCH_SIZE);

After failed attempts with the clang++ from Homebrew's LLVM and with g++-4.9, I used the default clang++ compiler, but removed the -fopenmp flags and the -pthread linker flag (the linker flag only, not the compiler flag), based on this answer.

Once it compiled, I tried to run it but got a libjpeg-related error:

dyld: Symbol not found: __cg_jpeg_resync_to_restart
  Referenced from: /System/Library/Frameworks/ImageIO.framework/Versions/A/ImageIO
  Expected in: /usr/local/lib/libJPEG.dylib
 in /System/Library/Frameworks/ImageIO.framework/Versions/A/ImageIO
Trace/BPT trap: 5

The workaround was mdemirst's answer. I made a backup of the old symlinks just in case, then symlinked libjpeg / libpng / libtiff / libgif from ImageIO.framework.

I've committed the above config / setup script on github.

Now that the example compiles, I still can't run it, probably due to insufficient GPU memory:

F0331 02:02:16.231935 528384 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
    @        0x10c7a89da  google::LogMessage::Fail()
    @        0x10c7a80d5  google::LogMessage::SendToLog()
    @        0x10c7a863b  google::LogMessage::Flush()
    @        0x10c7aba17  google::LogMessageFatal::~LogMessageFatal()
    @        0x10c7a8cc7  google::LogMessageFatal::~LogMessageFatal()
    @        0x1079481db  caffe::SyncedMemory::to_gpu()
    @        0x107947c9e  caffe::SyncedMemory::mutable_gpu_data()
    @        0x1079affba  caffe::CuDNNConvolutionLayer<>::Forward_gpu()
    @        0x107861331  caffe::Layer<>::Forward()
    @        0x107918016  caffe::Net<>::ForwardFromTo()
    @        0x1077a86f1  warmup()
    @        0x1077b211d  processFrame()
    @     0x7fff8b11899d  _pthread_body
    @     0x7fff8b11891a  _pthread_start
    @     0x7fff8b116351  thread_start
Abort trap: 6

I tried reducing the settings as much as possible, but to no avail. Actually running the example is probably a separate question in itself.
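
The trace fails in caffe::SyncedMemory::to_gpu(), reached from a cuDNN convolution layer's forward pass during warmup(), which suggests the network itself rather than the display buffers is what exhausts GPU memory. One way to sanity-check that is to estimate the explicit buffers from the cudaMalloc size expressions quoted in the question. A rough sketch with hypothetical function names; the resolution and limits passed in are assumptions, not values taken from rtpose.cpp:

```cpp
#include <cstddef>

// Bytes for the display canvas: width * height * 3 channels of float,
// matching the first cudaMalloc size expression quoted in the question.
constexpr std::size_t canvas_bytes(std::size_t width, std::size_t height) {
    return width * height * 3 * sizeof(float);
}

// Bytes for the joints buffer: max_parts * 3 * max_people floats,
// matching the second cudaMalloc size expression quoted in the question.
constexpr std::size_t joints_bytes(std::size_t max_parts, std::size_t max_people) {
    return max_parts * 3 * max_people * sizeof(float);
}

// Hypothetical helper: converts a byte count to mebibytes for printing.
constexpr double to_mib(std::size_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

With an assumed 1280x720 display and 4-byte floats, `canvas_bytes(1280, 720)` comes to roughly 10.5 MiB per network copy, so lowering the display resolution alone is unlikely to recover much memory; the network input size and batch size are the more likely levers.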