Question

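I just wrote a piece of Haskell code, and to debug it I put a bunch of print statements in it (so my most important function returned IO t when it only needed to return t). I saw that on a successful run the function used a lot of memory (about 1.2 GB). Once I saw that the program worked correctly, I removed all the print statements from the function and ran it, only to get this error: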

Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

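Even though it is exactly the same piece of code, for some reason the print statements make it avoid the stack space overflow. Can anyone tell me why this happens?

I know I haven't provided my code, which may make this harder to answer, but I've hacked a lot of things together and it doesn't look very pretty, so I doubt it would be useful. I'm fairly sure the only difference is the print statements.

EDIT:

Since people really want to see the code, here is the relevant part: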

linkCallers :: ([Int], Int, Int, I.IntDisjointSet, IntMap Int) -> ([Int], Int, Int, I.IntDisjointSet, IntMap Int)
linkCallers ([], x, y, us, im) = ([], x, y, us, im) 
linkCallers ((a:b:r), x, y, us, im) = if a == b
    then (r, x, y+1, us, im) 
    else if sameRep
        then (r, x+1, y+1, us, im) 
        else (r, x+1, y+1, us', im')
        where
            ar = fst $ I.lookup a us
            br = fst $ I.lookup b us  
            sameRep = case ar of
                Nothing -> False
                _ -> ar == br
            as' = ar >>= flip lookup im
            bs' = br >>= flip lookup im
            totalSize = do
                asize <- as' 
                bsize <- bs' 
                return $ asize + bsize
            maxSize = (convertMaybe as') + (convertMaybe bs')
            us' = I.union a b $ I.insert a $ I.insert b $ us
            newRep = fromJust $ fst $ I.lookup a us' 
            newRep' = fromJust $ fst $ I.lookup b us' 
            im'' = case ar of
                Nothing -> case br of
                    Nothing -> im
                    Just bk -> delete bk im
                Just ak -> delete ak $ case br of
                    Nothing -> im
                    Just bk -> delete bk im
            im' = case totalSize of  
                Nothing -> insert newRep maxSize im''
                Just t -> insert newRep t im''

startLinkingAux' (c,x,y,us,im) = let t@(_,x',_,us',im') = linkCallers (c,x,y,us,im) in
    case (fst $ I.lookup primeMinister us') >>= flip lookup im' >>= return . (>=990000) of
        Just True -> x'
        _ -> startLinkingAux' t

startLinkingAux' used to look like this:

startLinkingAux' (c,x,y,us,im) = do
    print (c,x,y,us,im)
    let t@(_,x',_,us',im') = linkCallers (c,x,y,us,im)
    case (fst $ I.lookup primeMinister us') >>= flip lookup im' >>= return . (>=990000) of
        Just True -> return x'
        _ -> startLinkingAux' t

Answer 1

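There is probably a memory leak in one of the arguments. The first thing I would try is to ask the author of disjoint-set to add an NFData instance for IntDisjointSet (or do it yourself; looking at the source code, it should be fairly easy). Then try calling force on all the values returned by linkCallers and see if that helps.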
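
One plausible reading of why print hid the overflow: showing the tuple forces every component on each iteration, so no long chain of unevaluated thunks builds up. The sketch below imitates that effect with force from Control.DeepSeq. Note that step is a hypothetical stand-in for linkCallers, and the state carries an IntMap (which already has an NFData instance) instead of an IntDisjointSet, so this only illustrates the technique, not the asker's actual program.

```haskell
import Control.DeepSeq (force)
import qualified Data.IntMap as IM

-- Hypothetical stand-in for linkCallers: one step over a tuple of state.
-- With a lazy IntMap, each insertWith leaves an unevaluated (+) thunk behind.
step :: (Int, Int, IM.IntMap Int) -> (Int, Int, IM.IntMap Int)
step (x, y, im) = (x + 1, y + 1, IM.insertWith (+) (x `mod` 10) 1 im)

-- Run n steps, fully evaluating the state before every recursive call.
-- 'force' alone is lazy, so 'seq' is used to demand its result right away.
runForced :: Int -> (Int, Int, IM.IntMap Int) -> (Int, Int, IM.IntMap Int)
runForced 0 s = s
runForced n s =
  let s' = force (step s)
  in s' `seq` runForced (n - 1) s'
```

Dropping the force (and the seq) gives the lazy version, where the counters and map values are only evaluated at the very end; that is the shape that tends to exhaust the stack.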

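Second, you are not using disjoint-set correctly. The main idea of the algorithm is that lookups compress paths in the set. That is what gives it its great performance! So every time you do a lookup, you should replace the old set with the new one that the lookup returns. But this makes a disjoint set rather clumsy to use in a functional language. I'd suggest using the State monad inside linkCallers, written as one big do block instead of a where clause, just passing in the starting set and extracting the final one. And define functions like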

insertS :: (MonadState IntDisjointSet m) => Int -> m ()
insertS x = modify (insert x)

lookupS :: (MonadState IntDisjointSet m) => Int -> m (Maybe Int)
lookupS x = state (lookup x)

-- etc

to use inside State. (Perhaps they would be a nice contribution to the library, since this is probably a common problem.)
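
A minimal sketch of how that could look end to end. To keep it self-contained it does not use the disjoint-set package: DSet, insertD, lookupD and unionD are a toy stand-in written for this example (they only mimic the shape of insert, lookup and union, with lookup returning the compressed set alongside the result), and linkPair is a hypothetical helper. It assumes mtl's Control.Monad.State.

```haskell
import Control.Monad.State (State, evalState, modify, state)
import qualified Data.IntMap as IM

-- Toy stand-in for IntDisjointSet: maps each element to its parent.
newtype DSet = DSet (IM.IntMap Int)

emptyD :: DSet
emptyD = DSet IM.empty

-- Insert an element as its own representative; existing elements are kept.
insertD :: Int -> DSet -> DSet
insertD x (DSet m) = DSet (IM.insertWith (\_ old -> old) x x m)

-- Find the representative and return a set with the path compressed.
lookupD :: Int -> DSet -> (Maybe Int, DSet)
lookupD x s@(DSet m) = case IM.lookup x m of
  Nothing -> (Nothing, s)
  Just p
    | p == x    -> (Just x, s)
    | otherwise -> case lookupD p s of
        (Just r, DSet m') -> (Just r, DSet (IM.insert x r m'))
        (Nothing, s')     -> (Nothing, s')

-- Link the two representatives if both exist and differ.
unionD :: Int -> Int -> DSet -> DSet
unionD a b s0 =
  let (ra, s1) = lookupD a s0
      (rb, s2) = lookupD b s1
  in case (ra, rb, s2) of
       (Just x, Just y, DSet m) | x /= y -> DSet (IM.insert x y m)
       _                                 -> s2

-- State wrappers: the updated set is threaded through automatically.
insertS :: Int -> State DSet ()
insertS x = modify (insertD x)

lookupS :: Int -> State DSet (Maybe Int)
lookupS x = state (lookupD x)

unionS :: Int -> Int -> State DSet ()
unionS a b = modify (unionD a b)

-- Hypothetical helper: link a and b, then report whether they share a representative.
linkPair :: Int -> Int -> State DSet Bool
linkPair a b = do
  insertS a
  insertS b
  unionS a b
  ra <- lookupS a
  rb <- lookupS b
  return (ra == rb)

example :: Bool
example = evalState (linkPair 1 2) emptyD
```

The point of the wrappers is that the caller never handles the intermediate sets by hand, so the compressed set returned by a lookup cannot be dropped by accident.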

Finally, there are many small improvements that would make the code more readable:

Many times you apply a single function to two values. I'd suggest defining something like

onPair :: (a -> b) -> (a, a) -> (b, b)
onPair f (x, y) = (f x, f y)
-- and use it like:
(ar, br) = onPair (fst . flip I.lookup us) (a, b)

Likewise, using Applicative functions can shorten things:

sameRep = fromMaybe False $ (==) <$> ar <*> br
totalSize = (+) <$> as' <*> bs'

Then

im'' = maybe id delete ar . maybe id delete br $ im
im' = insert newRep (fromMaybe maxSize totalSize) im''
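
As a sanity check on these rewrites, the sketch below compares the shortened forms with the original case expressions over a few sample inputs. The function names (sameRepCase, sameRepApp, deleteCase, deleteMaybe, agree) and the sample map are made up for this example; only the bodies come from the code above.

```haskell
import Data.IntMap (IntMap, delete, fromList)
import Data.Maybe (fromMaybe)

-- Original case-based definition of sameRep.
sameRepCase :: Maybe Int -> Maybe Int -> Bool
sameRepCase ar br = case ar of
  Nothing -> False
  _       -> ar == br

-- Applicative version: Nothing on either side yields False.
sameRepApp :: Maybe Int -> Maybe Int -> Bool
sameRepApp ar br = fromMaybe False $ (==) <$> ar <*> br

-- Original nested-case definition of im''.
deleteCase :: Maybe Int -> Maybe Int -> IntMap Int -> IntMap Int
deleteCase ar br im = case ar of
  Nothing -> case br of
    Nothing -> im
    Just bk -> delete bk im
  Just ak -> delete ak $ case br of
    Nothing -> im
    Just bk -> delete bk im

-- Shortened version: delete each key only if it is present as a Just.
deleteMaybe :: Maybe Int -> Maybe Int -> IntMap Int -> IntMap Int
deleteMaybe ar br im = maybe id delete ar . maybe id delete br $ im

-- True when both pairs of definitions agree on every sample combination.
agree :: Bool
agree = and
  [ sameRepCase a b == sameRepApp a b
      && deleteCase a b sample == deleteMaybe a b sample
  | a <- cases, b <- cases ]
  where
    cases  = [Nothing, Just 1, Just 2]
    sample = fromList [(1, 10), (2, 20), (3, 30)]
```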

Hope it helps.

Haskell memory usage and IO