Unexpected memory usage of cycle function

Date: 2012-07-09 21:52:30

Tags: memory haskell cycle

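In the program below, I expect only cycle3 to run in constant memory (it explicitly ties the knot). However, cycle2 also runs in constant memory, for reasons I cannot work out. I would expect cycle2 to do exactly the same work as cycle1, because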

xs' = xs ++ xs'
xs' = xs ++ xs ++ xs' -- substitute value of xs'
xs' = xs ++ xs ++ xs ++ xs' -- substitute value of xs' again
xs' = xs ++ xs ++ xs ++ ... -- and so on

Can someone explain what I am missing here?


module Main where

import System.Environment (getArgs)

cycle1 :: [a] -> [a]
cycle1 [] = error "empty list"
cycle1 xs = xs ++ cycle1 xs

cycle2 :: [a] -> [a]
cycle2 [] = error "empty list"
cycle2 xs = xs' where xs' = xs ++ xs'

cycle3 :: [a] -> [a]
cycle3 [] = error "empty list"
cycle3 xs = let
  xs' = go xs' xs
  in xs'
  where
    go :: [a] -> [a] -> [a]
    go start [last] = last : start
    go start (x:xs) = x : go start xs

testMem :: (Show a) => ([a] -> [a]) -> [a] -> IO ()
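-- Only the first component is ever printed, because it is infinite. The second
-- component keeps a reference to the head of the list, so the printed prefix
-- cannot be garbage collected.
testMem f xs = print (xs', xs')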
  where
    xs' = f xs

main :: IO ()
main = do
  args <- getArgs
  let mCycleFunc = case args of
        ["1"] -> Just cycle1
        ["2"] -> Just cycle2
        ["3"] -> Just cycle3
        _ -> Nothing
  case mCycleFunc of
    Just cycleFunc -> testMem cycleFunc [0..8]
    Nothing -> putStrLn "Valid args are one of {1, 2, 3}."

2 answers:

Answer 0 (score: 6)

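cycle1 creates a new list every time a cycle is consumed, because each pass over xs ends in a fresh call to cycle1 xs.

cycle2, however, does not do this. It creates a variable xs' that is used in its own definition. In cycle1, the cycle1 function has to be re-evaluated every time xs is consumed, but cycle2 contains no recursive function call at all. It just refers to the same variable, which already has a known value.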
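One way to watch this difference is to instrument both definitions with Debug.Trace. The sketch below is only an illustration: cycle1T and cycle2T are names made up here, trace writes to stderr, and how often the messages appear may vary with optimization settings. Without optimization, one would expect the cycle1T message once per pass over xs and the cycle2T message a single time.

```haskell
import Debug.Trace (trace)

-- Hypothetical instrumented copy of cycle1: the message fires each time
-- the recursive call is actually evaluated.
cycle1T :: [a] -> [a]
cycle1T [] = error "empty list"
cycle1T xs = trace "cycle1T: building another copy" (xs ++ cycle1T xs)

-- Hypothetical instrumented copy of cycle2: xs' is one shared value, so its
-- right-hand side is evaluated at most once.
cycle2T :: [a] -> [a]
cycle2T [] = error "empty list"
cycle2T xs = xs'
  where
    xs' = trace "cycle2T: building xs'" (xs ++ xs')

main :: IO ()
main = do
  -- Both calls produce the same elements; only the trace messages differ.
  print (take 20 (cycle1T [0 .. 8 :: Int]))
  print (take 20 (cycle2T [0 .. 8 :: Int]))
```

Both functions return the same list elements, so the trace messages are the only visible difference in this small run.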

Answer 1 (score: 3)

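It comes down to sharing or not sharing equal thunks. Two equal thunks are thunks that are guaranteed to produce the same result. In the case of cycle1, you create a new thunk for cycle1 xs every time you hit the [] at the end of xs. New memory has to be allocated for that thunk, and its value has to be computed from scratch, which allocates new list pairs as you go through it.

I think the way cycle2 avoids this becomes easier to understand if you rename xs' to result (I have also removed the "error on []" case):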

cycle2 :: [a] -> [a]
cycle2 xs = result 
    where result = xs ++ result

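This definition is semantically equivalent to cycle1 (it produces the same results for the same arguments), but the key to understanding the memory usage is to look at it in terms of which thunks are created. When you execute the compiled code for this function, all it does immediately is create a thunk for result. You can think of a thunk as a mutable type, more or less like this (completely made-up pseudocode):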

type Thunk a = union { NotDone (ThunkData a), Done a }
type ThunkData a = struct { force :: t0 -> ... -> tn -> a
                          , subthunk0 :: t0
                          , ...
                          , subthunkn :: tn }

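This is either a record containing pointers to the thunks for the needed values, plus a pointer to the code that forces those thunks, or just the result of the computation. In the case of cycle2, the thunk for result points to the object code for (++) and to the thunks for xs and result. That last part means the thunk for result has a pointer to itself, which explains the constant-space behavior; the last step in forcing result is making it point back to itself.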

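In the case of cycle1, on the other hand, the thunk has the code for (++), a thunk for xs, and a new thunk to compute cycle1 xs from scratch. In principle, a compiler could recognize that the reference to this latter thunk can be replaced with one to the "parent" thunk, but the compiler does not do this; whereas in cycle2 it cannot help but do so (one instantiation of one variable binding = one thunk).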
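The "pending computation or finished value" idea can be approximated in ordinary Haskell with an IORef. This is a toy model only, not how GHC represents thunks; Thunk, delay and force are names invented for this sketch, and it shows only the memoizing update, not self-reference.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)

-- Toy thunk: Left holds a pending computation, Right holds its cached result.
newtype Thunk a = Thunk (IORef (Either (IO a) a))

-- Wrap a computation without running it.
delay :: IO a -> IO (Thunk a)
delay action = Thunk <$> newIORef (Left action)

-- Run the computation on first use, then overwrite it with its result.
force :: Thunk a -> IO a
force (Thunk ref) = do
  state <- readIORef ref
  case state of
    Right value -> pure value
    Left action -> do
      value <- action
      writeIORef ref (Right value)
      pure value

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  t <- delay (modifyIORef counter (+ 1) >> pure (6 * 7 :: Int))
  a <- force t
  b <- force t
  runs <- readIORef counter
  print (a, b, runs) -- prints (42,42,1): the computation ran only once
```

Forcing the same thunk twice reuses the cached value, which is the kind of sharing cycle2 gets from its single result binding. cycle1 instead builds a fresh thunk each time around.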

Note that this self-referential thunk behavior can be factored out into a suitable implementation of fix:

-- | Find the least fixed point of @f@.  This implementation should produce
-- self-referential thunks, and thus run in constant space.
fix :: (a -> a) -> a
fix f = result
    where result = f result

cycle4 :: [a] -> [a]
cycle4 xs = fix (xs++)
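For comparison, here is a small sketch that puts a sharing fix next to a naive one. It assumes the fix exported by Data.Function, which base defines with a shared local binding; fixNaive, cycleShared and cycleUnshared are names made up for illustration. Both versions give the same elements. The expected difference is only in allocation, and an optimizing compiler could change that.

```haskell
import Data.Function (fix)

-- Naive fixed point (hypothetical name): every unfolding calls fixNaive
-- again, so no thunk is shared between unfoldings.
fixNaive :: (a -> a) -> a
fixNaive f = f (fixNaive f)

-- Built on the library fix; the result refers back to itself.
-- Assumes a non-empty argument, like cycle4 above.
cycleShared :: [a] -> [a]
cycleShared xs = fix (xs ++)

-- Built on fixNaive; behaves like cycle1, allocating a new copy per pass.
cycleUnshared :: [a] -> [a]
cycleUnshared xs = fixNaive (xs ++)

main :: IO ()
main = do
  print (take 7 (cycleShared "abc"))   -- prints "abcabca"
  print (take 7 (cycleUnshared "abc")) -- prints "abcabca"
```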