我正在为Haskell中的简单命令式语言编写一个编译器,输出Java字节码。我已经到了我发出字节码的抽象表示的地步。
在编写用于编译if语句的代码时遇到了一些麻烦。要实现if语句,我需要跳转到标签。因此,我需要为该标签生成一个名称,该名称必须是唯一的。
我的第一个想法是通过compileStatement
来处理一些状态,即
compileStatement :: Statement -> UniqueIDState -> [AbstractInstruction]
当然,compilerStatement
是递归的,所以使用这种方法需要我从递归调用中将唯一ID生成器的状态传递回upp:
compileStatement :: Statement -> UniqueIDState -> (UniqueIdState, [AbstractInstruction])
这似乎有点笨拙,特别是如果我意识到我需要在将来携带更多状态;有更优雅的方式吗?
答案 0 :(得分:5)
您需要“独特的供应”。在Haskell中执行此操作的常用方法是通过状态monad线程计数器,这会自动化您描述的管道问题。
答案 1 :(得分:4)
正如所说,你可以使用State Monad。或者您可以使用“Unique Supply Monad”。
http://www.haskell.org/ghc//docs/6.10.3/html/libraries/ghc/UniqSupply.html#3
它只是一个国家Moand的包装:
http://en.wikibooks.org/wiki/Haskell/Practical_monads#Make_a_monad_over_state
答案 2 :(得分:2)
如果您仅有的工具是锤子,我想这很诱人,就像对待钉子一样对待一切。
亚伯拉罕·马斯洛。
有什么不一样的地方-一个唯一的供应商,它不是类的成员Monad
。碰巧的是,您几乎可以使用原始类型签名了:
compileStatement :: Statement -> UniqueIDState -> [AbstractInstruction]
如果仅 的要求是每个标签都是唯一的-无需计算使用了多少标签,在相同情况下也可以提供相同的标识符,等等-您可以使用一种侵入性较小的技术
摘自John Launchbury和Simon Peyton Jones的State in Haskell第39-40页:
newUniqueSupply :: IO UniqueSupply
splitUniqueSupply :: UniqueSupply -> (UniqueSupply, UniqueSupply)
getUnique :: UniqueSupply -> Unique
instance Eq Unique
instance Ord Unique
instance Text Unique
-- interface --
-- ================ --
-- implementation --
data UniqueSupply = US Unique UniqueSupply UniqueSupply
type Unique = Int
-- where all the action happens!
newUniqueSupply :: IO UniqueSupply
newUniqueSupply
= newVar 0 `thenST` \ uvar ->
let
next :: IO Unique
next = interleaveST (
readVar uvar `thenST` \ u ->
writeVar uvar (u+1) `thenST_`
returnStrictlyST u
)
supply :: IO UniqueSupply
supply = interleaveST (
next `thenST` \ u ->
supply `thenST` \ s1 ->
supply `thenST` \ s2 ->
returnST (US u s1 s2)
)
in
supply
-- bits so boring they're not even in the paper...
splitUniqueSupply (US _ s1 s2) = (s1, s2)
getUnique (US u _ _) = u
是的... 1996年的Haskell样本-让我们重新整理一下吧:
module UniqueSupply(
Unique, UniqueSupply,
newUniqueSupply, splitUniqueSupply, getUnique
) where
import Control.Monad (liftM3)
import Data.IORef (IORef, newIORef, atomicModifyIORef)
import System.IO.Unsafe (unsafeInterleaveIO)
newtype Unique = U Int deriving (Eq, Ord, Read, Show)
data UniqueSupply = US Unique UniqueSupply UniqueSupply
newUniqueSupply :: IO UniqueSupply
newUniqueSupply
= do uvar <- newIORef 0
let next :: IO Unique
next = unsafeInterleaveIO (atomicModifyIORef uvar (\u -> (u+1, U u)))
supply :: IO UniqueSupply
supply = unsafeInterleaveIO (liftM3 US next supply supply)
supply
splitUniqueSupply :: UniqueSupply -> (UniqueSupply, UniqueSupply)
splitUniqueSupply (US _ s1 s2) = (s1, s2)
getUnique :: UniqueSupply -> Unique
getUnique (US u _ _) = u
现在它又可以工作了,有一些烦恼要处理:
使用两种类型;
缺乏多态性;
固定的生成方式;
错误重用的可能性。
最后一点特别有趣。假设:
data Statement =
... | If Statement Statement Statement | ...
然后,如果:
compileStatement (If c t e) s =
case splitUniqueSupply s of
(s1, s2) -> case splitUniqueSupply s2 of
(s3, s4) -> buildCondJump (compileStatement c s1)
(compileStatement t s3)
(compileStatement e s4)
错误地更改为:
compileStatement (If c t e) s =
case splitUniqueSupply s of
(s1, s2) -> case splitUniqueSupply s2 of
(s3, s4) -> buildCondJump (compileStatement c s)
(compileStatement t s)
(compileStatement e s)
不仅错误地重用了UniqueSupply
和Unique
的值,而且如果对compileStatement
的任何递归调用都大量使用供应,则存在空间泄漏的可能性。
我们现在考虑第二点:缺乏多态性。假设存在合适的类型:
data Fresh a = Fresh a (Fresh a) (Fresh a)
freshNew :: ... -> IO (Fresh a)
splitFresh :: Fresh a -> (Fresh a, Fresh a)
pluckFresh :: Fresh a -> a
这意味着:
instance Functor Fresh where
fmap h (Fresh u s1 s2) = Fresh (h u) (fmap h s1) (fmap h s2)
然后启发:
freshNew :: (Int -> a) -> IO (Fresh a)
freshNew g = fmap (fmap g) freshInts
splitFresh :: Fresh a -> (Fresh a, Fresh a)
splitFresh (Fresh _ s1 s2) = (s1, s2)
pluckFresh :: Fresh a -> a
pluckFresh (Fresh u _ _) = u
然后我们可以将freshInts
设为私有:
freshInts :: IO (Fresh Int)
freshInts = do uvar <- newIORef 0
let incr u = (u+1, u)
next = unsafeInterleaveIO $
atomicModifyIORef uvar incr
supply = unsafeInterleaveIO $
liftM3 Fresh next supply supply
supply
如果用户仅需要Int
值:
do .
.
.
int_supply <- freshNew id {- id x = x -}
.
.
.
作为奖励,它还补救了两种类型的使用以及固定的生成方式(第一点和第三点)。 Fresh
新模块的时间:
module Fresh(
Fresh,
freshNew, splitFresh, pluckFresh
) where
import Control.Monad (liftM3)
import Data.IORef (IORef, newIORef, atomicModifyIORef)
import System.IO.Unsafe (unsafeInterleaveIO)
data Fresh a = Fresh a (Fresh a) (Fresh a)
instance Functor Fresh where
fmap h (Fresh u s1 s2) = Fresh (h u) (fmap h s1) (fmap h s2)
freshNew :: (Int -> a) -> IO (Fresh a)
freshNew g = fmap (fmap g) freshInts
splitFresh :: Fresh a -> (Fresh a, Fresh a)
splitFresh (Fresh _ s1 s2) = (s1, s2)
pluckFresh :: Fresh a -> a
pluckFresh (Fresh u _ _) = u
-- local definition
freshInts :: IO (Fresh Int)
freshInts = do uvar <- newIORef 0
let incr u = (u+1, u)
next = unsafeInterleaveIO $
atomicModifyIORef uvar incr
supply = unsafeInterleaveIO $
liftM3 Fresh next supply supply
supply
现在要进行重用之谜,并尝试尝试一个答案:
freshNew :: (Int -> a) -> IO (Fresh a)
freshNew g = do z <- newIORef ()
fmap (fmap g) (freshInts z)
freshInts :: IORef () -> IO (Fresh Int)
freshInts z = do let using :: () -> (a, ())
using x = (error "already used!", x)
() <- atomicModifyIORef z using
z1 <- newIORef ()
z2 <- newIORef ()
let incr u = (u+1, u)
next = unsafeInterleaveIO $
atomicModifyIORef uvar incr
supply = unsafeInterleaveIO $
liftM3 Fresh next (freshInts z1) (freshInts z2)
supply
...是的,那是一个原型好吗-我们可以做得更好吗?
到目前为止,splitFresh
和pluckFresh
都是微不足道的:
splitFresh :: Fresh a -> (Fresh a, Fresh a)
splitFresh (Fresh _ s1 s2) = (s1, s2)
pluckFresh :: Fresh a -> a
pluckFresh (Fresh u _ _) = u
现在可以将某些作品freshInts
和freshNew
转移给他们吗?
如果splitFresh
可以直接生成一对子树,则Fresh
的值会更简单:
data Fresh a = Fresh ... {- no subtrees -}
splitFresh :: Fresh a -> (Fresh a, Fresh a)
splitFresh (Fresh g ...) = (Fresh ..., Fresh ...)
如果pluckFresh
可以访问生成器功能-g
中的freshNew
,则可以直接提供所需的唯一值:
data Fresh a = Fresh (... -> a) ...
pluckFresh :: Fresh a -> a
pluckFresh (Fresh g ...) = (g ...)
那又怎么样呢?
data Fresh a = Fresh (Int -> a) U
splitFresh :: Fresh a -> (Fresh a, Fresh a)
splitFresh (Fresh g n) = (Fresh g n1, Fresh g n2) where
(n1, n2) = splitU n
pluckFresh :: Fresh a -> a
pluckFresh (Fresh g n) = g (intOfU n)
其中:
splitU :: U -> (U, U)
intOfU :: U -> Int
可以使freshInts
如此简单:
freshInts :: IO (Fresh Int)
freshInts = do n <- initialU
return (Fresh (\x -> x) n)
假设:
initialU :: IO U
嗯-freshInts
的定义不太正确。然后是intOfU
-它与其他地方看到的东西很奇怪……
[...]在命令式程序中,您可以简单地为每个标识符调用
GenSym()
,从全局供应中分配唯一的名称,并对供应产生副作用,以便随后对GenSym()
将带来新的价值。
(摘自Launchbury和Peyton-Jones论文的第39页。)
让我们多加思考:
genSym
由于其(与Haskell外部)与全局(可变)供应源之间的交互,因此其类型为:
genSym :: IO Int
以防止在纯上下文中使用它。
早期的原型具有自己的外部交互功能-它使用可变的引用:
freshInts :: IORef () -> IO (Fresh Int)
防止重复使用Fresh
值。
...带有抽象的IO
类型,通过两种类型签名中的出现来指示那些外部交互的存在。
如果我们(谨慎地!)假设:
intOfU
是genSym
的伪装;
U
类型也可以作为外界互动的指标;
即:
type U = OI
genSym :: OI -> Int
intOfU :: U -> Int
intOfU = ... $ genSym
...这意味着:
data Fresh a = Fresh (Int -> a) OI
splitU :: OI -> (OI, OI)
这看起来很有希望-我们现在可以将genSym
移至freshInts
:
data Fresh a = Fresh (OI -> a) OI
pluckFresh :: Fresh a -> a
pluckFresh (Fresh g n) = g n
freshInts :: IO (Fresh Int)
freshInts = do n <- initialU
uvar <- newIORef 0
let incr n = (n + 1, n)
genSym :: IO Int
genSym = atomicModifyIORef uvar incr
intOfU :: OI -> Int
intOfU = ... $ genSym
return (Fresh intOfU n)
这看起来更明智-其他问题呢?
instance Functor Fresh where
fmap f (Fresh g n) = Fresh (f . g) n
freshNew :: (Int -> a) -> IO (Fresh a)
freshNew g = do n <- initialU
uvar <- newIORef 0
let incr n = (n + 1, n)
genSym :: IO Int
genSym = atomicModifyIORef uvar incr
intOfU :: OI -> Int
intOfU = ... $ genSym
return (Fresh (g . intOfU) n)
这看起来非常有前景-我们不再需要本地定义freshInts
!我们只需要定义 U
OI
,initialU
和splitU
-,就需要考虑一些事项:
还记得Fresh
中错误地重用compileStatement
值的问题吗?好吧,这些OI
值也存在相同的问题:
pourFresh :: Fresh a -> [a]
pourFresh (Fresh g n) = map g (pourU n)
pourU :: OI -> [OI]
pourU n = n1 : pourU n1 where (n1, n2) = splitU n
OI
类型的构造函数的现成可用性将使该问题更加严重。
我们仍然假设此OI
类型表明存在外部互动-很像这种简单的IO
...
这表明OI
类型应该是抽象的。当我们正在处理原型并且您可能已经在使用它时,最简单的选择也许就是根据需要使用 Glaskell GHC扩展。
深呼吸的时间,并进行了一些更改:
-- for GHC 8.6.5
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import Data.Char (isSpace)
import Data.IORef (IORef, newIORef, atomicModifyIORef)
import Prelude (Int, String, Eq(..), Functor(..), Num(..))
import Prelude ((.), ($), (++), error, all)
import GHC.Base (IO(..), State#, MutVar#, RealWorld)
import GHC.Base (seq#, newMutVar#, atomicModifyMutVar#, noDuplicate#)
data OI = OI OI#
type OI# = String -> State# RealWorld
type IO# a = State# RealWorld -> (# State# RealWorld, a #)
part# :: OI# -> (# OI#, OI# #)
part# h = case h "partOI" of
s -> case dispense# s of
(# s', h1 #) ->
case dispense# s' of
(# _, h2 #) -> (# h1, h2 #)
dispense# :: IO# OI#
dispense# s = case newMutVar# () s of
(# s', r #) -> (# s', expire# s' r #)
expire# :: State# s -> MutVar# s () -> String -> State# s
expire# s r name = case atomicModifyMutVar# r use s of
(# s', () #) -> s'
where
use x = (error nowUsed, x)
nowUsed = name' ++ ": already expired"
name' = if all isSpace name then "(unknown)"
else name
invokes# :: Monomo a => String -> IO# a -> OI# -> a
(name `invokes#` act) h = case act (noDuplicate# (h name)) of (# _, t #) -> t
class Monomo a
您现在可以再次开始呼吸;这里也有一些名称更改:
partFresh :: Fresh a -> (Fresh a, Fresh a)
partFresh (Fresh g u) = case partOI u of
(u1, u2) -> (Fresh g u1, Fresh g u2)
pluckFresh :: Fresh a -> a
pluckFresh (Fresh g u) = g u
freshNew :: (Int -> a) -> IO (Fresh a)
freshNew g = do uvar <- newIORef 0
let incr n = (n + 1, n)
genSym :: IO Int
genSym = atomicModifyIORef uvar incr
gensym :: OI -> Int
gensym = "gensym" `invokes` genSym
runOI (Fresh (g . gensym))
instance Monomo Int
因此,您已经拥有了:一个简单的唯一电源(除了一个定义)是不含monad的:
您只需要定义例如:
nextID :: Int -> ID
对类型签名的最终更改是适度的:
compileStatement :: Statement -> Fresh ID -> [AbstractInstruction]
但是,如果您确实需要 do
,则可以使用Fresh
作为单子类型的基础,例如:
type Supply i a = Fresh i -> a
unit :: a -> Supply i a
unit x = \u -> partFresh u `seq` x
bind :: Supply i a -> (a -> Supply i b) -> Supply i b
bind m k = \u -> case partFresh u of (u1, u2) -> (\x -> x `seq` k x u2) (m u1)
其中:
-- for GHC 8.6.5
{-# LANGUAGE CPP #-}
#define during seq
import qualified Prelude(during)
{-# NOINLINE seq #-}
infixr 0 `seq`
seq :: a -> b -> b
seq x y = Prelude.during x (case x of _ -> y)
或:
-- for GHC 8.6.5
{-# LANGUAGE CPP #-}
#define during seq
import qualified Prelude(during)
import GHC.Base(lazy)
infixr 0 `seq`
seq :: a -> b -> b
seq x y = Prelude.during x (lazy y)
...因为Prelude.seq
isn't actually sequential。
(是的:这些定义是特定于GHC的;对于其他Haskell实现,最简单的选择可能是添加一个新的原语。对于扩展本身,它们与每个定义一起使用。)
嗯...很有趣:
-- for GHC 8.6.5
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Fresh(
Fresh,
freshNew, partFresh, pluckFresh
) where
import Data.Char (isSpace)
import Data.IORef (IORef, newIORef, atomicModifyIORef)
import Prelude (Int, String, Eq(..), Functor(..), Num(..))
import Prelude ((.), ($), (++), error, all)
import GHC.Base (IO(..), State#, MutVar#, RealWorld)
import GHC.Base (seq#, newMutVar#, atomicModifyMutVar#, noDuplicate#)
partFresh :: Fresh a -> (Fresh a, Fresh a)
partFresh (Fresh g u) = case partOI u of
(u1, u2) -> (Fresh g u1, Fresh g u2)
pluckFresh :: Fresh a -> a
pluckFresh (Fresh g u) = g u
freshNew :: (Int -> a) -> IO (Fresh a)
freshNew g = do uvar <- newIORef 0
let incr n = (n + 1, n)
genSym :: IO Int
genSym = atomicModifyIORef uvar incr
gensym :: OI -> Int
gensym = "gensym" `invokes` genSym
runOI (Fresh (g . gensym))
instance Functor Fresh where
fmap f (Fresh g n) = Fresh (f . g) n
-- local definitions --
data Fresh a = Fresh (OI -> a) OI
partOI :: OI -> (OI, OI)
partOI (OI h) = case part# h of (# h1, h2 #) -> (OI h1, OI h2)
runOI :: (OI -> a) -> IO a
runOI g = IO $ \s -> case dispense# s of
(# s', h #) -> seq# (g (OI h)) s'
invokes :: Monomo a => String -> IO a -> OI -> a
(name `invokes` IO act) (OI h)
= (name `invokes#` act) h
class Monomo a
-- extended definitions --
data OI = OI OI#
type OI# = String -> State# RealWorld
type IO# a = State# RealWorld -> (# State# RealWorld, a #)
part# :: OI# -> (# OI#, OI# #)
part# h = case h "partOI" of
s -> case dispense# s of
(# s', h1 #) ->
case dispense# s' of
(# _, h2 #) -> (# h1, h2 #)
dispense# :: IO# OI#
dispense# s = case newMutVar# () s of
(# s', r #) -> (# s', expire# s' r #)
expire# :: State# s -> MutVar# s () -> String -> State# s
expire# s r name = case atomicModifyMutVar# r use s of
(# s', () #) -> s'
where
use x = (error nowUsed, x)
nowUsed = name' ++ ": already expired"
name' = if all isSpace name then "(unknown)"
else name
invokes# :: Monomo a => String -> IO# a -> OI# -> a
(name `invokes#` act) h = case act (noDuplicate# (h name)) of (# _, t #) -> t
-- supplemental instances --
instance Monomo Int
...我们甚至设法摆脱了unsafe...
的定义-很好!
P.S:如果您想知道特殊的Monomo
类,可以在Standard ML ...的历史中找到线索...