Linker error when compiling an OpenMP `target` directive with Clang 6 (trunk) and the nvptx target

Date: 2017-09-10 20:10:47

Tags: cuda openmp llvm-clang

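I want to use LLVM/Clang to run OpenMP 4+ code on my Nvidia GPU. I downloaded and built the llvm, clang and openmp libraries from trunk, following the instructions at https://clang.llvm.org/get_started.html and at https://openmp.llvm.org/. I did not build Compiler-RT or libcxx, but I don't think that makes any difference.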

My CMake command was:

cmake -G "Unix Makefiles" ../llvm -DCMAKE_BUILD_TYPE=Release -DOPENMP_ENABLE_LIBOMPTARGET=ON

I wrote a very basic program with a single OpenMP `target` directive:

int main(void)
{
    #pragma omp target
    {
    }
    return 0;
}

I compile it with:

/home/user/opt/llvm/bin/clang++ -v main.cpp -fopenmp -lomptarget -fopenmp-targets=nvptx64-nvidia-cuda --cuda-path=/home/user/opt/pgi/linux86-64/2017/cuda/8.0

In case you ask: yes, I have not put the compiler on my path, but I made sure `LD_LIBRARY_PATH` points to the directory where `libomptarget` is located.

This is the output I get from the command above (the last ~10 lines show the error):

<If this is too much information, just go to the last 10 lines to see the error>
clang version 6.0.0 (trunk 312875)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/user/opt/llvm/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6.4.0
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/7.2.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.4.1
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.4.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.2.0
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.2.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found CUDA installation: /home/user/opt/pgi/linux86-64/2017/cuda/8.0, version 7.0
 "/home/user/opt/llvm/bin/clang-6.0" -cc1 -triple x86_64-unknown-linux-gnu -emit-llvm-bc -emit-llvm-uselists -disable-free -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/user/opt/llvm/lib/clang/6.0.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward -internal-isystem /usr/local/include -internal-isystem /home/user/opt/llvm/lib/clang/6.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /usr/local/include -internal-isystem /home/user/opt/llvm/lib/clang/6.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 190 -fopenmp -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -o /tmp/main-be2d35.bc -x c++ main.cpp -fopenmp-targets=nvptx64-nvidia-cuda
clang -cc1 version 6.0.0 based upon LLVM 6.0.0svn default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/include"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward"
ignoring duplicate directory "/usr/local/include"
ignoring duplicate directory "/home/user/opt/llvm/lib/clang/6.0.0/include"
ignoring duplicate directory "/usr/include/x86_64-linux-gnu"
ignoring duplicate directory "/usr/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward
 /usr/local/include
 /home/user/opt/llvm/lib/clang/6.0.0/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
 "/home/user/opt/llvm/bin/clang-6.0" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -S -disable-free -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model pic -pic-level 2 -mthread-model posix -mdisable-fp-elim -fmath-errno -no-integrated-as -fuse-init-array -mlink-cuda-bitcode /home/user/opt/pgi/linux86-64/2017/cuda/8.0/nvvm/libdevice/libdevice.compute_20.10.bc -target-feature +ptx42 -target-cpu sm_20 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/user/opt/llvm/lib/clang/6.0.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward -internal-isystem /usr/local/include -internal-isystem /home/user/opt/llvm/lib/clang/6.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /usr/local/include -internal-isystem /home/user/opt/llvm/lib/clang/6.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -fno-dwarf-directory-asm -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 190 -fopenmp -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -o /tmp/main-7ffbd7.s -x c++ main.cpp -fopenmp-is-device -fopenmp-host-ir-file-path /tmp/main-be2d35.bc
clang -cc1 version 6.0.0 based upon LLVM 6.0.0svn default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/include"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward"
ignoring duplicate directory "/usr/local/include"
ignoring duplicate directory "/home/user/opt/llvm/lib/clang/6.0.0/include"
ignoring duplicate directory "/usr/include/x86_64-linux-gnu"
ignoring duplicate directory "/usr/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward
 /usr/local/include
 /home/user/opt/llvm/lib/clang/6.0.0/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
 "/home/user/opt/pgi/linux86-64/2017/cuda/8.0/bin/ptxas" -m64 -O0 -v --gpu-name sm_20 --output-file /tmp/main-64fc86.cubin /tmp/main-ca9e59.s -c
ptxas info : 1 bytes gmem, 8 bytes cmem[14]
ptxas info : Compiling entry function '__omp_offloading_803_18004c0_main_l3' for 'sm_20'
ptxas info : Function properties for __omp_offloading_803_18004c0_main_l3
 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 5 registers, 32 bytes cmem[0]
ptxas info : Function properties for __omp_offloading_803_18004c0_main_l3_worker
 24 bytes stack frame, 8 bytes spill stores, 8 bytes spill loads
 "/home/user/opt/pgi/linux86-64/2017/cuda/8.0/bin/nvlink" -o /tmp/main-f247e3.out -v -arch sm_20 -L/home/user/opt/llvm/lib -lomptarget-nvptx main-64fc86.cubin
nvlink error : Undefined reference to '__kmpc_kernel_init' in 'main-64fc86.cubin'
nvlink error : Undefined reference to '__kmpc_kernel_deinit' in 'main-64fc86.cubin'
nvlink error : Undefined reference to '__kmpc_kernel_parallel' in 'main-64fc86.cubin'
nvlink error : Undefined reference to '__kmpc_kernel_end_parallel' in 'main-64fc86.cubin'
nvlink info : 1 bytes gmem, 8 bytes cmem[14]
nvlink info : Function properties for '__omp_offloading_803_18004c0_main_l3':
nvlink info : used 18 registers, 24 stack, 0 bytes smem, 32 bytes cmem[0], 0 bytes lmem
clang-6.0: error: fatbinary command failed with exit code 255 (use -v to see invocation)

Any idea which library should contain these `__kmpc*` symbols? I tried running:

nm libomptarget.so | grep __kmpc_kernel_parallel

nm libomptarget.rtl.cuda.so | grep __kmpc_kernel_parallel

but neither command returns anything.

Finally, if I remove `-fopenmp-targets=nvptx64-nvidia-cuda` from the compile flags, I don't get the linker error. But of course no CUDA code is generated in that case.

Any feedback that helps me figure out what is going on, where these symbols should be found, and why they are missing would be very welcome.
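The undefined references are reported against the cubin, and the nvlink line in the log asks for `-lomptarget-nvptx` under `-L/home/user/opt/llvm/lib`, so one quick check is whether that device-side library was built at all. The sketch below only tests for file names in a directory. The helper name `check_offload_libs` is made up for illustration, and `libomptarget-nvptx.a` is an assumption about what `-lomptarget-nvptx` resolves to, not something the log confirms.

```shell
#!/bin/sh
# check_offload_libs DIR
# Hypothetical helper: reports whether each offloading library mentioned
# in the question is present in DIR.
check_offload_libs() {
    dir=$1
    # libomptarget.so and libomptarget.rtl.cuda.so come from the nm commands
    # in the question. libomptarget-nvptx.a is an assumed file name for the
    # library requested by -lomptarget-nvptx on the nvlink line.
    for name in libomptarget.so libomptarget.rtl.cuda.so libomptarget-nvptx.a; do
        if [ -f "$dir/$name" ]; then
            echo "found   $name"
        else
            echo "missing $name"
        fi
    done
}

# Example (path taken from the log):
#   check_offload_libs /home/user/opt/llvm/lib
```

If the device library shows up as missing, that would be consistent with nvlink having nothing to resolve the `__kmpc_kernel_*` references against.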

1 answer:

Answer 0 (score: 2)

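Unfortunately, OpenMP target offloading support for nvptx devices has not been upstreamed yet. There is a recent branch on GitHub, with build instructions on its wiki: https://github.com/clang-ykt/clang/wiki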
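Once a compiler has been built from that branch, the minimal program from the question can be reused as a smoke test. The following is only a sketch: the function name `try_offload_build` is hypothetical, the flags are copied from the question, and whether the branch expects exactly these flags is an assumption to verify against the wiki.

```shell
#!/bin/sh
# try_offload_build COMPILER [CUDA_PATH]
# Hypothetical helper: writes the question's minimal program to a temporary
# directory and tries to build it with the offloading flags from the question.
# Prints one status line and returns 0 on success, 1 on failure.
try_offload_build() {
    cxx=$1
    cuda_path=${2:-}
    workdir=$(mktemp -d) || return 1

    # Same minimal program as in the question.
    cat > "$workdir/main.cpp" <<'EOF'
int main(void)
{
    #pragma omp target
    {
    }
    return 0;
}
EOF

    # Flags copied from the question; --cuda-path is added only when given.
    set -- "$workdir/main.cpp" -o "$workdir/a.out" \
        -fopenmp -lomptarget -fopenmp-targets=nvptx64-nvidia-cuda
    if [ -n "$cuda_path" ]; then
        set -- "$@" "--cuda-path=$cuda_path"
    fi

    if "$cxx" "$@"; then
        echo "offload build OK"
        status=0
    else
        echo "offload build FAILED"
        status=1
    fi
    rm -rf "$workdir"
    return $status
}

# Example (paths taken from the question):
#   try_offload_build /home/user/opt/llvm/bin/clang++ /home/user/opt/pgi/linux86-64/2017/cuda/8.0
```

A successful build only shows that device linking went through. It does not show that the region actually runs on the GPU.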