How do I fold over a list of a lazy type ([IO a])?

Date: 2012-01-21 15:47:07

Tags: performance haskell

I don't know whether 'lazy type' is a valid term, but IO is lazy anyway. Consider:

import Control.Monad
import Data.List

result :: IO Double
result = foldl1' (liftM2 (+)) $ map return [1..10000000]

result' :: IO Double
result' = return $ foldl1' (+) [1..10000000]
Unlike result', result is slow and uses a lot of memory. How should I fold a [IO a]?

3 answers:

Answer 0: (score: 7)

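result constructs one big IO Double value without evaluating any intermediate results. Evaluation only happens when the total result is demanded, e.g. for printing. foldl' evaluates the intermediate results to weak head normal form (WHNF), i.e. to the outermost constructor or lambda. Since (in GHC) IO a has the constructor IO, the intermediate results of the fold have the form

IO (some computation combined with another)

and the expression under the IO becomes more complex at each step.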
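The difference between evaluating an IO action and forcing the value it produces can be seen in a small sketch. The names lazyResult, forceActionOnly and forceResult are made up for this illustration and do not appear in the question:

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Hypothetical action for illustration: running it succeeds, but the
-- value it returns is a thunk that throws as soon as it is forced.
lazyResult :: IO Int
lazyResult = return (error "result was forced")

-- seq only evaluates the action itself to WHNF. It neither runs the
-- action nor touches the value inside, so no error is raised here.
forceActionOnly :: IO String
forceActionOnly = lazyResult `seq` return "action in WHNF, result untouched"

-- Running the action and then forcing its result does hit the error,
-- which try reports as a Left value.
forceResult :: IO (Either ErrorCall Int)
forceResult = try (lazyResult >>= evaluate)
```

This is the same effect foldl1' has on the accumulator in result: each step is forced only as far as "this is an IO action", never as far as the Double inside it.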

To avoid this, you must force not only the intermediate IO actions but also the values they return. For example:

main :: IO ()
main = foldlM' (\a -> fmap (a+)) 0 (map return [1.0 .. 10000000]) >>= print

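-- A left fold over monadic steps that forces each intermediate result.
foldlM' :: Monad m => (a -> b -> m a) -> a -> [b] -> m a
foldlM' _   a []     = return a
foldlM' foo a (b:bs) = do
    c <- foo a b
    -- Force the new accumulator to WHNF before continuing.
    c `seq` foldlM' foo c bs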

This works for your example.
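As an alternative sketch, the same idea can probably be expressed with foldM from Control.Monad instead of a hand-written fold. sumIO is a hypothetical helper name, and the strictness comes from return $!, which in IO forces the sum before handing it to the next step:

```haskell
import Control.Monad (foldM)

-- sumIO is a hypothetical helper, not part of the answer above.
-- It runs each action in turn and keeps a strictly evaluated total.
sumIO :: [IO Double] -> IO Double
sumIO = foldM step 0
  where
    step acc action = do
      x <- action
      -- ($!) forces acc + x to WHNF before it becomes the new accumulator.
      return $! acc + x
```

It would be used like the fold above, for instance sumIO (map return [1.0 .. 10000000]) >>= print.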

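Answer 1: (score: 4)

The problem is that foldl' only reduces the accumulator to WHNF at each step. In this case the accumulator is an IO action, and evaluating an IO action does not force the value it produces, so you end up with the typical huge thunk that is not evaluated until the very end.

The solution is to use something stricter than liftM2, for example: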

result'' :: IO Double
result'' = foldl1' f $ map return [1..100000]
    where f mx my = do x <- mx; y <- my; return $! x + y
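The local function f can be factored out into a reusable strict counterpart of liftM2. This is only a sketch: liftM2' and sumActions are hypothetical names, not functions from Control.Monad, and the strictness guarantee is stated here for IO, where return $! forces its argument when the action runs:

```haskell
import Data.List (foldl1')

-- Hypothetical strict variant of liftM2: the combined value is forced
-- to WHNF before it is returned, so no chain of (+) thunks builds up.
liftM2' :: Monad m => (a -> b -> c) -> m a -> m b -> m c
liftM2' f mx my = do
  x <- mx
  y <- my
  return $! f x y

-- Same shape as result'' above, written with the helper.
-- Like foldl1', this is undefined for an empty list.
sumActions :: [IO Double] -> IO Double
sumActions = foldl1' (liftM2' (+))
```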

Here is a quick benchmark:

import Control.Monad
import Data.List
import Criterion.Main

result :: IO Double
result = foldl1' (liftM2 (+)) $ map return [1..100000]

result' :: IO Double
result' = return $ foldl1' (+) [1..100000]

result'' :: IO Double
result'' = foldl1' f $ map return [1..100000]
  where f mx my = do x <- mx; y <- my; return $! x + y

main = defaultMain [ bench "result" $ whnfIO result
                   , bench "result'" $ whnfIO result'
                   , bench "result''" $ whnfIO result'' ]

Results:

[...]
benchmarking result
collecting 100 samples, 1 iterations each, in estimated 37.32438 s
mean: 136.3221 ms, lb 131.4504 ms, ub 140.8238 ms, ci 0.950
std dev: 23.92297 ms, lb 22.00429 ms, ub 25.53803 ms, ci 0.950

benchmarking result'
collecting 100 samples, 14 iterations each, in estimated 6.046951 s
mean: 4.349027 ms, lb 4.338121 ms, ub 4.367363 ms, ci 0.950
std dev: 70.96316 us, lb 49.01322 us, ub 113.0399 us, ci 0.950

benchmarking result''
collecting 100 samples, 2 iterations each, in estimated 8.131099 s
mean: 41.89589 ms, lb 40.67513 ms, ub 43.52798 ms, ci 0.950
std dev: 7.194770 ms, lb 5.758892 ms, ub 8.529327 ms, ci 0.950

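As you can see, this is still slower than the pure code (about 42 ms versus 4.3 ms), but much faster than the liftM2 version (about 136 ms).

Answer 2: (score: 0)

Building on Daniel's answer, this variant allows a direct implementation of an analogue of foldl1':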

result'' :: IO Double
result'' = foldlM1' (+) (map return [1.0 .. 100000000])

foldlM' :: Monad m => (a -> b -> a) -> m a -> [m b] -> m a
foldlM' foo ma [] = ma
foldlM' foo ma (mb:mbs) = do
    c <- (liftM2 foo) ma mb
    c `seq` foldlM' foo (return c) mbs

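-- Uses the first action as the initial accumulator; partial, like foldl1'.
foldlM1' :: Monad m => (a -> a -> a) -> [m a] -> m a
foldlM1' foo (x:xs) = foldlM' foo x xs
foldlM1' _ []       = error "Empty list for foldlM1'"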