I am following along with the book Implementing Functional Languages: A Tutorial and have basically finished a Core compiler and interpreter based on the G-Machine. The chapter descriptions say that G-Code can be translated into machine code, but how? The G-Code instructions drive a state transition machine that manipulates several stacks and association lists. How can something like this be done in LLVM using the llvm-general/llvm-general-pure packages? Ultimately, I'd like this compiler to output an executable file.
Here is what the state and the instructions look like:
type GmState = (GmOutput,  -- current output
                GmCode,    -- current instruction stream
                GmStack,   -- current stack
                GmDump,    -- a stack for WHNF reductions
                GmHeap,    -- heap of nodes
                GmGlobals, -- global addresses in heap
                GmStats)   -- statistics
type GmOutput = [Char]
type GmCode = [Instruction]
type GmStack = [Addr]
type GmDump = [GmDumpItem]
type GmDumpItem = (GmCode, GmStack)
type GmHeap = Heap Node
type GmGlobals = ASSOC Name Addr
type GmStats = Int
data Instruction
    = Unwind          -- unravels the spine of the evaluation tree (such as evaluating the next supercombinator)
    | Pushglobal Name -- pushes the address of a global onto the stack
    | Pushint Int     -- allocates a number node in the heap and pushes its address
    | Push Int        -- pushes a copy of the nth address in the stack onto the top of the stack
    | Mkap            -- pops the top 2 addresses, allocates an application node pointing to both, and pushes its address
    | Update Int      -- pops the top address and overwrites the node at the nth stack address with an indirection to it
    | Pop Int         -- pops the first n addresses from the stack
    | Slide Int       -- like Pop, but keeps the first address on top of the stack
    | Alloc Int       -- allocates n placeholder nodes in the heap and pushes their addresses
    | Eval            -- saves the remaining code and stack on the dump, then evaluates the top address to WHNF by unwinding
    | Add | Sub | Mul | Div | Neg
    | Eq | Ne | Lt | Le | Gt | Ge
    | Cond GmCode GmCode
    | Pack Int Int    -- builds a constructor node (tag, arity) from the top addresses on the stack
    | Casejump [(Int, GmCode)] -- selects the alternative whose tag matches the constructor node on top of the stack
    | Split Int       -- looks up the constructor node at the top address and pushes its n component addresses
    | Print           -- writes to GmOutput
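To make the state-transition behaviour concrete, here is a minimal sketch of a step function for four of these instructions. It is deliberately cut down and is not the code from the book: the names `St`, `Instr`, `alloc`, `step` and `run` are made up for this illustration, the heap is a plain association list, and the dump, globals, output and statistics are left out.

```haskell
type Addr = Int

-- Simplified heap node; the real Node type has more constructors.
data Node = NNum Int | NAp Addr Addr
  deriving (Eq, Show)

-- Only the instructions exercised by this sketch.
data Instr = Pushint Int | Push Int | Mkap | Slide Int
  deriving (Eq, Show)

-- Cut-down state: code, stack and heap only (hypothetical record).
data St = St { code :: [Instr], stack :: [Addr], heap :: [(Addr, Node)] }
  deriving (Eq, Show)

-- Allocate a node at the next free address; nothing is ever freed here.
alloc :: Node -> [(Addr, Node)] -> (Addr, [(Addr, Node)])
alloc n h = (a, (a, n) : h)
  where a = length h

-- One state transition. Nothing means the code is exhausted or the
-- stack does not have the shape the instruction expects.
step :: St -> Maybe St
step (St [] _ _) = Nothing
step (St (i : is) s h) = case (i, s) of
  (Pushint n, _) ->
    let (a, h') = alloc (NNum n) h in Just (St is (a : s) h')
  (Push n, _) | n >= 0 && n < length s ->
    Just (St is (s !! n : s) h)
  (Mkap, a1 : a2 : rest) ->
    let (a, h') = alloc (NAp a1 a2) h in Just (St is (a : rest) h')
  (Slide n, a : rest) ->
    Just (St is (a : drop n rest) h)
  _ -> Nothing

-- Apply transitions until no further step is possible.
run :: St -> St
run st = maybe st run (step st)
```

Loading this in GHCi and evaluating `run (St [Pushint 1, Pushint 2, Mkap] [] [])` leaves a single address on the stack that points at an application node in the heap.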
What I would essentially like to know is this: is there an LLVM data structure that can mimic the G-Machine? If so, what is it, and is a stack machine within LLVM the right approach?
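To show the kind of translation I have in mind, here is a naive sketch that prints textual LLVM IR instead of going through llvm-general. Every G-Code instruction becomes a call into a runtime that would own the stack and heap. All the `@gm_*` functions are hypothetical (they would have to be written separately, for example in C), `emit`, `runtimeDecls` and `emitFunction` are names invented for this example, and I do not know whether treating Unwind as an ordinary call is sound.

```haskell
-- Subset of the G-Code instructions, enough to illustrate the idea.
data Instr = Pushint Int | Push Int | Mkap | Update Int | Pop Int | Unwind

-- Translate one instruction into one line of LLVM IR that calls a
-- hypothetical runtime function with the instruction's operand.
emit :: Instr -> String
emit (Pushint n) = "  call void @gm_pushint(i64 " ++ show n ++ ")"
emit (Push n)    = "  call void @gm_push(i64 " ++ show n ++ ")"
emit Mkap        = "  call void @gm_mkap()"
emit (Update n)  = "  call void @gm_update(i64 " ++ show n ++ ")"
emit (Pop n)     = "  call void @gm_pop(i64 " ++ show n ++ ")"
emit Unwind      = "  call void @gm_unwind()"

-- Declarations for the hypothetical runtime entry points.
runtimeDecls :: [String]
runtimeDecls =
  [ "declare void @gm_pushint(i64)"
  , "declare void @gm_push(i64)"
  , "declare void @gm_mkap()"
  , "declare void @gm_update(i64)"
  , "declare void @gm_pop(i64)"
  , "declare void @gm_unwind()"
  ]

-- Wrap the code of one supercombinator in an LLVM function definition.
emitFunction :: String -> [Instr] -> String
emitFunction name is =
  unlines (("define void @" ++ name ++ "() {") : "entry:" : map emit is ++ ["  ret void", "}"])
```

With this, `putStr (unlines runtimeDecls ++ emitFunction "sc_main" [Pushint 1, Update 0, Pop 0, Unwind])` would print a module that could be fed to the LLVM tools, provided the runtime functions exist. In this sketch the G-Machine state lives entirely in the runtime rather than in any LLVM data structure, which is exactly the part I am unsure about.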