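In the following program, I expected only cycle3 to run in constant memory (it explicitly ties the knot). However, cycle2 also runs in constant memory, for reasons I cannot understand. I would expect cycle2 to do exactly the same work as cycle1, because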
xs' = xs ++ xs'
xs' = xs ++ xs ++ xs' -- substitute value of xs'
xs' = xs ++ xs ++ xs ++ xs' -- substitute value of xs' again
xs' = xs ++ xs ++ xs ++ ... -- and so on
Can someone explain what I am missing here?
module Main where

import System.Environment (getArgs)

cycle1 :: [a] -> [a]
cycle1 [] = error "empty list"
cycle1 xs = xs ++ cycle1 xs

cycle2 :: [a] -> [a]
cycle2 [] = error "empty list"
cycle2 xs = xs' where xs' = xs ++ xs'

cycle3 :: [a] -> [a]
cycle3 [] = error "empty list"
cycle3 xs = let
    xs' = go xs' xs
  in xs'
  where
    go :: [a] -> [a] -> [a]
    go start [last] = last : start
    go start (x:xs) = x : go start xs

testMem :: (Show a) => ([a] -> [a]) -> [a] -> IO ()
-- Only the first component is ever printed (it is infinite); the second
-- component keeps a reference to the head of the list alive.
testMem f xs = print (xs', xs')
  where
    xs' = f xs

main :: IO ()
main = do
  args <- getArgs
  let mCycleFunc = case args of
        ["1"] -> Just cycle1
        ["2"] -> Just cycle2
        ["3"] -> Just cycle3
        _     -> Nothing
  case mCycleFunc of
    Just cycleFunc -> testMem cycleFunc [0..8]
    Nothing        -> putStrLn "Valid args are one of {1, 2, 3}."
Answer 0 (score: 6)
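cycle1 creates a new list on every pass through the cycle, because each pass ends in a fresh recursive call. cycle2 does not do this. It creates a variable xs' that is used in its own definition. In cycle1, the function cycle1 has to be re-evaluated every time xs is consumed, whereas cycle2 contains no recursive function call at all. It only refers to the same variable, which already has a known value.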
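One way to observe this difference is to instrument both definitions with Debug.Trace.trace. The names cycle1T and cycle2T below are hypothetical traced copies of the question's functions (the empty-list case is omitted for brevity), so treat this as a sketch rather than a measurement. trace writes its message to stderr when the wrapped expression is first evaluated, so cycle1T should report once per pass through the list, while cycle2T should report only once.

```haskell
import Debug.Trace (trace)

-- Traced copy of cycle1: every pass ends in a fresh call to cycle1T,
-- so the message is emitted again each time the list wraps around.
cycle1T :: [a] -> [a]
cycle1T xs = trace "cycle1T called" (xs ++ cycle1T xs)

-- Traced copy of cycle2: xs' is a single binding that refers to itself,
-- so no further call happens when the list wraps around.
cycle2T :: [a] -> [a]
cycle2T xs = trace "cycle2T called" xs'
  where xs' = xs ++ xs'

main :: IO ()
main = do
  print (take 7 (cycle1T [1, 2, 3 :: Int]))
  print (take 7 (cycle2T [1, 2, 3 :: Int]))
```

Both calls yield the same elements; only the number of trace messages on stderr differs.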
Answer 1 (score: 3)
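It comes down to sharing or not sharing equal thunks. Two equal thunks are thunks that are guaranteed to produce the same result. In the case of cycle1, you create a new thunk for cycle1 xs every time you hit the [] at the end of xs. New memory has to be allocated for that thunk, and its value has to be computed from scratch, which allocates new list pairs as you traverse it.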
I think the way cycle2 avoids this becomes easier to understand if you rename xs' to result (I have also removed the error case for []):
cycle2 :: [a] -> [a]
cycle2 xs = result
  where result = xs ++ result
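This definition is semantically equivalent to cycle1 (it produces the same results for the same arguments), but the key to understanding the memory use is to look at it in terms of which thunks are created. When the compiled code for this function runs, all it does right away is create a thunk for result. You can think of a thunk as a mutable type, more or less like this (completely made-up pseudocode):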
type Thunk a = union { NotDone (ThunkData a), Done a }
type ThunkData a = struct { force :: t0 -> ... -> tn -> a
, subthunk0 :: t0
, ...
, subthunkn :: tn }
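That is, a thunk is either a record containing pointers to the thunks for the values it needs plus a pointer to the code that forces those thunks, or just the result of the computation. In the case of cycle2, the thunk for result points to the object code for (++) and to the thunks for xs and result. That last part means the thunk for result holds a pointer to itself, which explains the constant-space behavior: the final step of forcing result is to make it point back to itself.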
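The made-up record above can be approximated in ordinary Haskell with an IORef. This is only a toy model: the names ThunkState, Thunk, delay and force are invented for this sketch and do not describe GHC's actual runtime representation. What it shows is the NotDone to Done transition: the suspended computation runs on the first force, and later forces reuse the cached value.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef, modifyIORef)

-- A thunk is either a suspended computation or its cached result.
data ThunkState a = NotDone (IO a) | Done a

newtype Thunk a = Thunk (IORef (ThunkState a))

-- Suspend a computation without running it.
delay :: IO a -> IO (Thunk a)
delay act = Thunk <$> newIORef (NotDone act)

-- Run the computation on first use, then overwrite the cell with the result.
force :: Thunk a -> IO a
force (Thunk ref) = do
  st <- readIORef ref
  case st of
    Done v      -> pure v
    NotDone act -> do
      v <- act
      writeIORef ref (Done v)
      pure v

main :: IO ()
main = do
  runs <- newIORef (0 :: Int)
  t <- delay (modifyIORef runs (+ 1) >> pure (sum [1 .. 100 :: Int]))
  a <- force t
  b <- force t
  n <- readIORef runs
  print (a, b, n)  -- prints (5050,5050,1): the computation ran only once
```

In these terms, cycle2 corresponds to a single cell whose suspended computation mentions the cell itself, whereas cycle1 corresponds to calling delay again on every pass.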
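In the case of cycle1, on the other hand, the thunk has the code for (++), a thunk for xs, and a new thunk that computes cycle1 xs from scratch. In principle, a compiler could recognize that the reference to this latter thunk can be replaced with a reference to the "parent" thunk, but the compiler does not do this. In cycle2, by contrast, it cannot do anything else (one instantiation of one variable binding = one thunk).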
Note that this self-referential thunk behavior can be factored out into a suitable implementation of fix:
-- | Find the least fixed point of @f@. This implementation should produce
-- self-referential thunks, and thus run in constant space.
fix :: (a -> a) -> a
fix f = result
  where result = f result
cycle4 :: [a] -> [a]
cycle4 xs = fix (xs++)
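To make "suitable" concrete, the sketch below contrasts two ways of writing a fixed-point combinator. The names fixShared and fixUnshared are hypothetical and exist only for this illustration. fixShared names its result once and refers back to it, like the fix above; fixUnshared makes a fresh recursive call at every unfolding, which mirrors cycle1. Data.Function.fix from base is, to the best of my knowledge, written in the sharing style (fix f = let x = f x in x). The two variants are semantically equal, and how an optimizing compiler treats the unshared one is not guaranteed.

```haskell
import Data.Function (fix)

-- Sharing style: one binding that refers to itself.
fixShared :: (a -> a) -> a
fixShared f = result
  where result = f result

-- Non-sharing style: each unfolding is a new call, as in cycle1.
fixUnshared :: (a -> a) -> a
fixUnshared f = f (fixUnshared f)

cycle4 :: [a] -> [a]
cycle4 xs = fixShared (xs ++)

main :: IO ()
main = do
  let expected = take 7 (cycle "abc")
  -- All three agree on the elements; they differ only in allocation behavior.
  print (take 7 (cycle4 "abc") == expected)
  print (take 7 (fixUnshared ("abc" ++)) == expected)
  print (take 7 (fix ("abc" ++)) == expected)
```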