import qualified Data.MemoCombinators as Memo
import System.Environment
single = fib' 80000
where fib' = Memo.integral fib''
fib'' 0 = 0
fib'' 1 = 1
fib'' x = fib' (x-1) + fib' (x-2)
doubleFast = fib' 80000 + fib' 80000
where fib' = Memo.integral fib''
fib'' 0 = 0
fib'' 1 = 1
fib'' x = fib' (x-1) + fib' (x-2)
doubleSlow = g 80000 + g 80000
where g x = fib' x
where fib' = Memo.integral fib''
fib'' 0 = 0
fib'' 1 = 1
fib'' x = fib' (x-1) + fib' (x-2)
main = do
args <- getArgs
case args of
["single"] -> print single
["doubleFast"] -> print doubleFast
["doubleSlow"] -> print doubleSlow
./ file single + RTS -s
产量
1,085,072,320 bytes allocated in the heap
387,297,448 bytes copied during GC
265,811,512 bytes maximum residency (10 sample(s))
99,107,440 bytes maximum slop
433 MB total memory in use (0 MB lost due to fragmentation)
Total time 2.78s ( 2.78s elapsed)
./ file doubleFast + RTS -s
产生相同的结果。这对我来说很有意义:由于fib'
在doubleFast
的范围内,fib'
在计算第一个fib' 80000
后不会被丢弃,可用于直接给出第二个fib' 80000
{1}}。
我不明白的是:
./ file doubleSlow + RTS -s
2,166,532,752 bytes allocated in the heap
551,826,896 bytes copied during GC
263,793,848 bytes maximum residency (11 sample(s))
204,460,968 bytes maximum slop
818 MB total memory in use (0 MB lost due to fragmentation)
Total time 14.22s ( 14.24s elapsed)
如果我错了,请纠正我:fib'
用于计算第一个g 80000 = fib' 80000
。保留函数g
,因为g没有记忆,所以不能重复使用它来直接计算第二个g 80000
。此外,在第一次调用g 80000
后,fib'
的查找表将被丢弃,因为它不在doubleSlow
的范围内。
800 MB对我来说很有意义,因为查找表必须构建两次。然而,为什么它需要14.22s而不是2.78s?我预计大约两倍,约。 5.8S。
答案 0 :(得分:3)
使用优化进行编译时,我得到的所有三种方式几乎都是相同的数字 - fib'
和fib''
然后也被提升到doubleSlow
的顶层,因为它们用于其他两个甚至没有优化,所以区别仅在于查找,添加了两个大数字,以及doubleX
与single
相加的结果。
在没有优化的情况下进行编译时,doubleSlow
大约需要4.5倍,而分配大约是其他两个的两倍。差异的主要原因是GC更长,
MUT time 1.64s ( 1.64s elapsed)
GC time 7.09s ( 7.10s elapsed)
MUT time 0.75s ( 0.75s elapsed)
GC time 1.07s ( 1.07s elapsed)
并且计算时间(MUT)大约是两倍,这符合查找数据结构构建两次的事实(您的推理或多或少是正确的,查找表是g
的本地,然后,并且g 80000
被调用两次。
我对GHC的垃圾收集器不太熟悉,以解释为什么收集第一个查找表会花费那么长的默认设置,但是它花费的时间非常依赖于分配区域的大小,我得到了最好的结果设置为3MB,
$ ./memfib +RTS -s -A3M -RTS doubleSlow > /dev/null
2,166,534,656 bytes allocated in the heap
599,670,152 bytes copied during GC
161,411,104 bytes maximum residency (13 sample(s))
99,163,664 bytes maximum slop
414 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 562 colls, 0 par 0.49s 0.49s 0.0009s 0.0041s
Gen 1 13 colls, 0 par 0.62s 0.62s 0.0477s 0.0841s
INIT time 0.00s ( 0.00s elapsed)
MUT time 1.61s ( 1.62s elapsed)
GC time 1.11s ( 1.11s elapsed)
EXIT time 0.03s ( 0.03s elapsed)
Total time 2.76s ( 2.76s elapsed)
%GC time 40.2% (40.2% elapsed)
Alloc rate 1,343,022,682 bytes per MUT second
Productivity 59.8% of total user, 59.7% of total elapsed
与
相比$ ./memfib +RTS -s -A3M -RTS doubleFast > /dev/null
1,085,085,832 bytes allocated in the heap
311,633,528 bytes copied during GC
165,830,256 bytes maximum residency (9 sample(s))
99,121,736 bytes maximum slop
389 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 280 colls, 0 par 0.22s 0.22s 0.0008s 0.0042s
Gen 1 9 colls, 0 par 0.34s 0.34s 0.0376s 0.0792s
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.79s ( 0.79s elapsed)
GC time 0.55s ( 0.56s elapsed)
EXIT time 0.03s ( 0.03s elapsed)
Total time 1.38s ( 1.38s elapsed)
%GC time 40.3% (40.3% elapsed)
Alloc rate 1,378,141,985 bytes per MUT second
Productivity 59.7% of total user, 59.7% of total elapsed
这几乎是两个因素。