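I have compiled OpenCV (2.4.6.1) from source with CUDA (6.0) enabled on OS X Mavericks (10.9.3).
Now I would like to write my own image-processing functions using a mix of OpenCV and CUDA. Let's take a simple example: we have an OpenCV Mat and want to do something to each element, using CUDA parallelization to speed it up. In our example we just print the value of each Mat element. Not realistic, but enough to show the concept.
CUDA header file: print.cuh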
#ifndef __PRINT_CUH__
#define __PRINT_CUH__
void print(const unsigned char * pixels, const int N);
#endif
CUDA source file: print.cu
#include <stdio.h>
#include "print.cuh"

// The device version: each thread prints one pixel
__global__ void cuda_print(const unsigned char * pixels, const int N)
{
    int tidX = blockIdx.x * blockDim.x + threadIdx.x;
    if( tidX >= N ) {
        return;
    }
    printf("pixel value @ %d = %d\n", tidX, pixels[tidX]);
}

// The host version
void print(const unsigned char * pixels, const int N) {
    int num_threads = 128;
    // Round up so that num_blocks * num_threads covers all N elements
    int num_blocks = (N + num_threads - 1) / num_threads;
    unsigned char * d_pixels;
    cudaMalloc( &d_pixels, sizeof(unsigned char) * N );
    cudaMemcpy( d_pixels, pixels, sizeof(unsigned char) * N, cudaMemcpyHostToDevice );
    cuda_print<<<num_blocks, num_threads>>>(d_pixels, N);
    cudaDeviceSynchronize(); // The kernel launch is asynchronous; wait until it
                             // finishes before exiting the program!
    cudaFree( d_pixels );    // Release the device buffer
}
C++ code that uses both OpenCV and our own CUDA code: main.cpp
#include <opencv2/opencv.hpp>
#include "print.cuh"
int main(int argc, char ** argv )
{
    cv::Mat m(100,1,CV_8UC1, cv::Scalar(0));
    print(m.ptr(0), m.rows);
    return 0;
}
We want to compile our own CUDA code into a shared library and link it into our main program.
CMake setup: CMakeLists.txt
# CUDA CMAKE TEST
cmake_minimum_required(VERSION 2.8)
# project name
project(CUDA_CMAKE)
# find dependencies
find_package(OpenCV REQUIRED)
find_package(CUDA REQUIRED)
# this is necessary on OS X since CUDA only supports the older libstdc++
IF(APPLE)
    SET(CUDA_HOST_COMPILER /usr/bin/clang CACHE FILEPATH "Setting clang as the CUDA compiler" FORCE)
    SET(CUDA_NVCC_FLAGS "-Xcompiler -stdlib=libstdc++; -Xlinker -stdlib=libstdc++; -arch=sm_20" CACHE STRING "Setting NVCC compiler flags" FORCE)
ENDIF()
# build a shared library with our CUDA code
CUDA_ADD_LIBRARY(cudaPrint
    SHARED
    print.cu
)
TARGET_LINK_LIBRARIES(cudaPrint
    ${CUDA_LIBRARIES}
)
# build the C++ code and link with the CUDA code
ADD_EXECUTABLE(cuda_test
    main.cpp
)
TARGET_LINK_LIBRARIES(cuda_test
    cudaPrint ${OpenCV_LIBS}
)
The first step of the build works fine and generates cudaPrint.dylib. However, when trying to build the executable, I get the following link error:
make all
-- Configuring done
CMake Warning at CMakeLists.txt:29 (ADD_EXECUTABLE):
Cannot generate a safe runtime search path for target cuda_test because
there is a cycle in the constraint graph:
dir 0 is [/Developer/NVIDIA/CUDA-5.5/lib]
dir 1 must precede it due to runtime library [libcudart.dylib]
dir 1 is [/usr/local/cuda/lib]
dir 0 must precede it due to runtime library [libcudart.dylib]
Some of these libraries may not be found correctly.
-- Generating done
-- Build files have been written to: /Users/navid/proj/CUDA/test_cuda_opencv/build
[ 50%] Built target cudaPrint
Linking CXX executable cuda_test
ld: can't map file, errno=22 file '/Developer/NVIDIA/CUDA-5.5/lib' for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [cuda_test] Error 1
make[1]: *** [CMakeFiles/cuda_test.dir/all] Error 2
make: *** [all] Error 2
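It looks like this error is related to OpenCV also pulling in the CUDA libraries. I am not sure, but I have a workaround for the problem, which I will post below.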
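One way to check this suspicion is to print what the two find_package calls actually resolved to. The following is only a diagnostic sketch, assuming it is placed in CMakeLists.txt after both find_package calls; CUDA_TOOLKIT_ROOT_DIR and CUDA_LIBRARIES are set by CMake's FindCUDA module, and OpenCV_LIBS by OpenCV's package config:

```cmake
# Diagnostic sketch: show which CUDA toolkit and which libraries CMake picked up.
# Place after find_package(OpenCV REQUIRED) and find_package(CUDA REQUIRED).
message(STATUS "CUDA_TOOLKIT_ROOT_DIR = ${CUDA_TOOLKIT_ROOT_DIR}")
message(STATUS "CUDA_LIBRARIES        = ${CUDA_LIBRARIES}")
message(STATUS "OpenCV_LIBS           = ${OpenCV_LIBS}")
```

If the printed toolkit directory differs from the CUDA directories named in the warning, that would be consistent with two CUDA installations being mixed on the link line.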
Answer 0 (score: 3)
So it looks like the link error is caused by libopencv_ts being passed to the linker via ${OpenCV_LIBS}, because FIND_PACKAGE(OpenCV REQUIRED) adds all OpenCV libraries to the ${OpenCV_LIBS} variable.
If libopencv_ts is not needed, a simple workaround is to specify the OpenCV libraries we explicitly need when asking CMake to find the package, e.g.: FIND_PACKAGE(OpenCV REQUIRED COMPONENTS core highgui cuda).
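Applied to the CMakeLists.txt from the question, the workaround could look roughly like the sketch below. The component list here is an assumption for illustration: main.cpp only uses cv::Mat, so core alone may be enough, and the name of the GPU module depends on the OpenCV release, so it is left out.

```cmake
# Sketch: request only the OpenCV modules the program uses, so that
# ${OpenCV_LIBS} no longer contains the ts (test support) library.
# Assumption: the component list is illustrative; adjust it to the
# modules your code actually needs.
find_package(OpenCV REQUIRED COMPONENTS core highgui)
find_package(CUDA REQUIRED)

cuda_add_library(cudaPrint SHARED print.cu)
target_link_libraries(cudaPrint ${CUDA_LIBRARIES})

add_executable(cuda_test main.cpp)
target_link_libraries(cuda_test cudaPrint ${OpenCV_LIBS})
```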
I don't know why libopencv_ts causes this cycle error or how to get around it while still linking against it.
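A possible alternative, which I have not verified, is to keep the plain find_package call and drop the one library from the list before linking. This sketch assumes the entry appears in ${OpenCV_LIBS} under the name opencv_ts; printing the list first shows whether that holds:

```cmake
# Untested sketch: filter the ts library out of the full OpenCV list.
# Assumption: the list entry is spelled exactly "opencv_ts".
find_package(OpenCV REQUIRED)
message(STATUS "OpenCV_LIBS before: ${OpenCV_LIBS}")
list(REMOVE_ITEM OpenCV_LIBS opencv_ts)
message(STATUS "OpenCV_LIBS after:  ${OpenCV_LIBS}")
```

list(REMOVE_ITEM) leaves the list unchanged when the name does not match, so the two status lines make it obvious whether the filter had any effect.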