nvcc -D_DEBUG --use_fast_math -I"/usr/local/cuda-9.0//include" -I"/usr/include/eigen3" -I"/home/xingfu/NVIDIA_CUDA-9.0_Samples/common/inc" -dlink --machine 64 -arch=sm_50 -c -o kernel_cuda.o ../CudaTest/kernel.cu
g++ -c -pipe -g -std=gnu++11 -Wall -W -D_REENTRANT -fPIC -DQT_DEPRECATED_WARNINGS -DQT_QML_DEBUG -DQT_CORE_LIB -I../CudaTest -I. -I/usr/local/cuda-9.0/include -isystem /usr/include/eigen3 -I../NVIDIA_CUDA-9.0_Samples/common/inc -isystem /usr/local/include -I../Qt5.11.0/5.11.0/gcc_64/include -I../Qt5.11.0/5.11.0/gcc_64/include/QtCore -I. -I../Qt5.11.0/5.11.0/gcc_64/mkspecs/linux-g++ -o LBDM.o ../CudaTest/LBDM.cpp
以上两个步骤已通过,但是,运行以下步骤时,发生了错误:
g++ -Wl,-rpath,/home/xingfu/Qt5.11.0/5.11.0/gcc_64/lib -o CudaTest kernel_cuda.o LBDM.o -L/usr/local/cuda-9.0//lib64/ -lcuda -lcudart -lcublas -L/home/xingfu/CudaTest/../../../usr/local/lib/ -lopencv_core -lopencv_highgui -lopencv_imgproc -lopencv_imgcodecs -L/home/xingfu/Qt5.11.0/5.11.0/gcc_64/lib -lQt5Core -lpthread
编译器错误显示:
kernel_cuda.o: In function `__sti____cudaRegisterAll()':
tmpxft_00000e7d_00000000-5_kernel.cudafe1.cpp:(.text+0x177e): undefined reference to `__cudaRegisterLinkedBinary_41_tmpxft_00000e7d_00000000_6_kernel_cpp1_ii_channel'
如何解决该错误?
此外, 我添加了-dlink,因为在执行以下步骤时会显示错误:
nvcc -D_DEBUG --use_fast_math -I"/usr/local/cuda-9.0//include" -I"/usr/include/eigen3" -I"/home/xingfu/NVIDIA_CUDA-9.0_Samples/common/inc" --machine 64 -arch=sm_50 -c -o kernel_cuda.o ../CudaTest/kernel.cu
,错误是:
ptxas fatal : Unresolved extern function 'cublasCreate_v2'
但是,当我添加-dlink时,发生了如上所述的错误。
顺便说一句,在添加-dlink之前,我可以在另一个测试项目中运行一个简单的函数,如下所示:
__global__ void add(float* x, float * y, float* z, int n)
{
int index = threadIdx.x + blockIdx.x * blockDim.x;
int stride = blockDim.x * gridDim.x;
for (int i = index; i < n; i += stride)
{
z[i] = x[i] + y[i];
}
}
添加-dlink后,测试项目显示错误:
cuda_code_cuda.o: In function `__sti____cudaRegisterAll()':
tmpxft_000017db_00000000-5_cuda_code.cudafe1.cpp:(.text+0x861): undefined reference to `__cudaRegisterLinkedBinary_44_tmpxft_000017db_00000000_6_cuda_code_cpp1_ii_5b538d80'
与上述错误非常相似。
答案 0 :(得分:1)
对于relocatable device code linking,这似乎是您追求的目标,建议的顺序如下。此外,您的代码似乎正在尝试使用cublas设备界面,因此,为使您更好,我们将这些库添加到链接步骤:
#replace -dlink -c with -dc
nvcc -D_DEBUG --use_fast_math -I"/usr/local/cuda-9.0//include" -I"/usr/include/eigen3" -I"/home/xingfu/NVIDIA_CUDA-9.0_Samples/common/inc" -dc --machine 64 -arch=sm_50 -o kernel_cuda.o ../CudaTest/kernel.cu
#generate device-linked object with cublas device libraries
nvcc -D_DEBUG --use_fast_math -dlink --machine 64 -arch=sm_50 -o kernel_dlink.o kernel_cuda.o -lcublas -lcublas_device -lcudadevrt
#no change to this line
g++ -c -pipe -g -std=gnu++11 -Wall -W -D_REENTRANT -fPIC -DQT_DEPRECATED_WARNINGS -DQT_QML_DEBUG -DQT_CORE_LIB -I../CudaTest -I. -I/usr/local/cuda-9.0/include -isystem /usr/include/eigen3 -I../NVIDIA_CUDA-9.0_Samples/common/inc -isystem /usr/local/include -I../Qt5.11.0/5.11.0/gcc_64/include -I../Qt5.11.0/5.11.0/gcc_64/include/QtCore -I. -I../Qt5.11.0/5.11.0/gcc_64/mkspecs/linux-g++ -o LBDM.o ../CudaTest/LBDM.cpp
#add device-linked object to final link phase plus cublas device libraries
g++ -Wl,-rpath,/home/xingfu/Qt5.11.0/5.11.0/gcc_64/lib -o CudaTest kernel_cuda.o LBDM.o kernel_dlink.o -L/usr/local/cuda-9.0//lib64/ -lcuda -lcudart -lcublas -lcublas_device -lcudadevrt -L/home/xingfu/CudaTest/../../../usr/local/lib/ -lopencv_core -lopencv_highgui -lopencv_imgproc -lopencv_imgcodecs -L/home/xingfu/Qt5.11.0/5.11.0/gcc_64/lib -lQt5Core -lpthread