Question

I am following along with the book Implementing Functional Languages: A Tutorial and have basically finished a Core compiler and interpreter based on the G-Machine. The chapter descriptions say that G-Code can be translated into machine code, but how? The G-Code instructions are meant to drive a state transition machine that manipulates several stacks and association lists. How can something like this be done in LLVM using the llvm-general/llvm-general-pure packages? Ultimately, I'd like this compiler to output an executable file.

Here is what the state and the instructions look like:

type GmState = (GmOutput,   -- current output
            GmCode,     -- current instruction stream
            GmStack,    -- current stack
            GmDump,     -- a stack for WHNF reductions
            GmHeap,     -- heap of nodes
            GmGlobals,  -- global addresses in heap
            GmStats)    -- statistics

type GmOutput = [Char]

type GmCode = [Instruction]

type GmStack = [Addr]

type GmDump = [GmDumpItem]
type GmDumpItem = (GmCode, GmStack)

type GmHeap = Heap Node

type GmGlobals = ASSOC Name Addr

type GmStats = Int

data Instruction = Unwind -- unravels the spine of the graph until it reaches a supercombinator or a value
             | Pushglobal Name -- pushes the address of a global onto the stack
             | Pushint Int -- allocates a number node in the heap and pushes its address
             | Push Int -- pushes a copy of the nth stack address onto the stack
             | Mkap -- pops the top two addresses, allocates an application node for them, and pushes its address
             | Update Int -- pops the top address and overwrites the node at the nth remaining stack address with an indirection to it
             | Pop Int -- pops the first n addresses from the stack
             | Slide Int -- like Pop, but keeps the top address and drops the n addresses below it
             | Alloc Int -- allocates n placeholder nodes in the heap and pushes their addresses
             | Eval -- saves the current code and the rest of the stack on the dump, then evaluates the top address to WHNF by unwinding
             | Add | Sub | Mul | Div | Neg
             | Eq | Ne | Lt | Le | Gt | Ge
             | Cond GmCode GmCode
             | Pack Int Int -- Pack t n: builds a constructor node with tag t from the top n stack addresses
             | Casejump [(Int, GmCode)] -- selects the alternative whose tag matches the constructor on top of the stack
             | Split Int -- Split n: replaces the constructor on top of the stack with its n component addresses
             | Print -- writes the evaluated top of the stack to GmOutput
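
To make the "state transition machine" reading concrete, here is a minimal sketch of a single step for a handful of these instructions. It is deliberately cut down and is not the book's code: MiniState, Instr, alloc, step and run are hypothetical names, the heap is a plain Data.Map rather than the book's Heap type, and the dump, globals and statistics are left out.

```haskell
import qualified Data.Map as M

type Addr = Int

-- Simplified heap nodes: only numbers and applications.
data Node = NNum Int | NAp Addr Addr
  deriving (Show, Eq)

-- A small subset of the full Instruction type.
data Instr = Pushint Int | Push Int | Mkap | Slide Int
  deriving (Show, Eq)

-- Hypothetical reduced state: code, stack and heap only.
data MiniState = MiniState
  { msCode  :: [Instr]
  , msStack :: [Addr]
  , msHeap  :: M.Map Addr Node
  } deriving (Show, Eq)

-- Allocate a node at the next unused address (nothing is ever freed here).
alloc :: Node -> M.Map Addr Node -> (Addr, M.Map Addr Node)
alloc node heap = (addr, M.insert addr node heap)
  where addr = M.size heap

-- One transition: consume the first instruction and update stack and heap.
step :: MiniState -> MiniState
step (MiniState (i : is) stack heap) = case i of
  Pushint n ->
    let (a, heap') = alloc (NNum n) heap
    in MiniState is (a : stack) heap'
  Push n ->
    MiniState is (stack !! n : stack) heap
  Mkap -> case stack of
    (a1 : a2 : rest) ->
      let (a, heap') = alloc (NAp a1 a2) heap
      in MiniState is (a : rest) heap'
    _ -> error "Mkap: stack underflow"
  Slide n -> case stack of
    (a : rest) -> MiniState is (a : drop n rest) heap
    []         -> error "Slide: empty stack"
step st = st -- no instructions left

-- Keep stepping until the instruction stream is empty.
run :: MiniState -> MiniState
run st
  | null (msCode st) = st
  | otherwise        = run (step st)
```

Loaded into GHCi, `msStack (run (MiniState [Pushint 1, Pushint 2, Mkap] [] M.empty))` leaves a single address on the stack, pointing at the freshly built application node.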

What I would essentially like to know is this: is there any LLVM data structure that can mimic the G-Machine? If so, what is it, and is a stack machine within LLVM the right approach?
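
To illustrate the kind of translation being asked about, the sketch below turns a supercombinator's G-Code into textual LLVM IR in which every instruction becomes a call into a runtime library that owns the stack and heap. This is only one conceivable shape, not a confirmed answer: the runtime routines (@gm_pushint, @gm_push, @gm_mkap, @gm_update, @gm_pop, @gm_unwind) and the Haskell names emitInstr and emitSc are all hypothetical, and the code builds plain strings instead of using llvm-general.

```haskell
-- A subset of the instruction set, enough to show the shape of the output.
data Instr = Pushint Int | Push Int | Mkap | Update Int | Pop Int | Unwind
  deriving (Show, Eq)

-- Emit one call to a hypothetical runtime routine per instruction.
emitInstr :: Instr -> String
emitInstr (Pushint n) = "  call void @gm_pushint(i64 " ++ show n ++ ")"
emitInstr (Push n)    = "  call void @gm_push(i64 " ++ show n ++ ")"
emitInstr Mkap        = "  call void @gm_mkap()"
emitInstr (Update n)  = "  call void @gm_update(i64 " ++ show n ++ ")"
emitInstr (Pop n)     = "  call void @gm_pop(i64 " ++ show n ++ ")"
emitInstr Unwind      = "  call void @gm_unwind()"

-- Wrap a supercombinator's code in an LLVM function definition.
-- Assumes the name is already a valid LLVM identifier.
emitSc :: String -> [Instr] -> String
emitSc name code = unlines $
  [ "define void @" ++ name ++ "() {"
  , "entry:"
  ] ++ map emitInstr code ++
  [ "  ret void"
  , "}"
  ]
```

For example, `putStr (emitSc "sc_main" [Pushint 3, Unwind])` prints a function whose body is two runtime calls followed by `ret void`. The emitted module would also need matching `declare` lines for the runtime routines, which would have to be written separately (for instance in C) and linked in.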

How do I translate G-Code to LLVM?
