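I have a simple C routine that takes four words and returns four words, and for which gcc can optimize and emit some primops that GHC doesn't support. I'm trying to benchmark various ways of calling this procedure, and I'm having trouble adapting the technique described here to use foreign import prim.
The following is meant to add 1 to each input word, but it segfaults.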
Main.hs:
{-# LANGUAGE GHCForeignImportPrim #-}
{-# LANGUAGE ForeignFunctionInterface #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
{-# LANGUAGE UnliftedFFITypes #-}
import Foreign.C
import GHC.Prim
import GHC.Int
import GHC.Word
foreign import prim "sipRound"
  sipRound_c# :: Word# -> Word# -> Word# -> Word# -> (# Word#, Word#, Word#, Word# #)
sipRound_c :: Word64 -> Word64 -> Word64 -> Word64 -> (Word64, Word64, Word64, Word64)
sipRound_c (W64# v0) (W64# v1) (W64# v2) (W64# v3) = case sipRound_c# v0 v1 v2 v3 of
  (# v0', v1', v2', v3' #) -> (W64# v0', W64# v1', W64# v2', W64# v3')
main = do
  print $ sipRound_c 1 2 3 4
sip.c:
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
// define a function pointer type that matches the STG calling convention
typedef void (*HsCall)(int64_t*, int64_t*, int64_t*, int64_t, int64_t, int64_t, int64_t,
                       int64_t, int64_t, int64_t*, float, float, float, float, double, double);
extern void
sipRound(
    int64_t* restrict baseReg,
    int64_t* restrict sp,
    int64_t* restrict hp,
    uint64_t v0, // R1
    uint64_t v1, // R2
    uint64_t v2, // R3
    uint64_t v3, // R4
    int64_t r5,
    int64_t r6,
    int64_t* restrict spLim,
    float f1,
    float f2,
    float f3,
    float f4,
    double d1,
    double d2)
{
    v0 += 1;
    v1 += 1;
    v2 += 1;
    v3 += 1;
    // create undefined variables, clang will emit these as a llvm undef literal
    const int64_t iUndef;
    const float fUndef;
    const double dUndef;
    const HsCall fun = (HsCall)sp[0];
    return fun(
        baseReg,
        sp,
        hp,
        v0,
        v1,
        v2,
        v3,
        iUndef,
        iUndef,
        spLim,
        fUndef,
        fUndef,
        fUndef,
        fUndef,
        dUndef,
        dUndef);
}
I really don't know what I'm doing. Is there a way to adapt the technique from that blog post? And is this a bad idea?
Answer 0 (score: 5)
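If you are willing to hand-write assembly, you can do it like this (for x86_64). Put this in a file with a .s extension and pass it as an argument on the ghc command line.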
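.global sipRound
sipRound:
    inc %rbx        # R1 (v0)
    inc %r14        # R2 (v1)
    inc %rsi        # R3 (v2)
    inc %rdi        # R4 (v3)
    jmp *(%rbp)     # Sp is %rbp: jump to the return address on top of the STG stack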
The mapping between STG registers and machine registers is defined in https://github.com/ghc/ghc/blob/master/includes/stg/MachRegs.h#L159.
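To make the register choice in the assembly above easier to follow, here is the integer-register part of that mapping for x86_64 written out as a small C lookup table. This is only a sketch transcribed by hand, not something GHC provides: the names `StgRegMap`, `stg_x86_64`, and `stg_machine_reg` are made up for illustration, and the entries should be checked against the MachRegs.h of the GHC version you are using.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: one row per STG virtual register and the x86_64
   machine register it is pinned to. These names are hypothetical. */
struct StgRegMap {
    const char *stg;      /* STG register name */
    const char *machine;  /* x86_64 machine register */
};

static const struct StgRegMap stg_x86_64[] = {
    {"BaseReg", "r13"},
    {"Sp",      "rbp"},
    {"Hp",      "r12"},
    {"R1",      "rbx"},
    {"R2",      "r14"},
    {"R3",      "rsi"},
    {"R4",      "rdi"},
    {"R5",      "r8"},
    {"R6",      "r9"},
    {"SpLim",   "r15"},
};

/* Return the machine register for an STG register name,
   or NULL if the name is not in the table. */
const char *stg_machine_reg(const char *stg)
{
    assert(stg != NULL);
    for (size_t i = 0; i < sizeof stg_x86_64 / sizeof stg_x86_64[0]; i++) {
        if (strcmp(stg_x86_64[i].stg, stg) == 0) {
            return stg_x86_64[i].machine;
        }
    }
    return NULL;
}
```

Reading the table against the assembly: the four `inc` instructions touch R1 through R4, and the final jump goes through Sp.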
Note that there will still be a function call involved, so it won't be as efficient as the code you get from LLVM.
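Since the goal is to benchmark different ways of calling the routine, one obvious baseline is a plain C function reached through an ordinary `foreign import ccall unsafe`. Here is a minimal sketch, assuming the four words are passed through a caller-supplied array; the name `sipRoundPlain` is hypothetical and does not appear in the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical baseline: the same "add 1 to each word" operation,
   written against the ordinary C calling convention. The four words
   are updated in place through a pointer, because a C function cannot
   hand back four separate values the way an unboxed tuple does. */
void sipRoundPlain(uint64_t v[4])
{
    assert(v != NULL);
    for (size_t i = 0; i < 4; i++) {
        v[i] += 1;  /* unsigned arithmetic wraps around on overflow */
    }
}
```

On the Haskell side this could be imported as `foreign import ccall unsafe "sipRoundPlain" c_sipRoundPlain :: Ptr Word64 -> IO ()`, with the buffer allocated by something like `allocaArray 4`. That adds marshalling through memory, which is exactly the overhead the `foreign import prim` version is trying to avoid, so it gives you a reference point rather than a replacement.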