The following two programs differ only in the strictness flag on the variable st.

$ cat testStrictL.hs
module Main (main) where

import qualified Data.Vector as V
import qualified Data.Vector.Generic as GV
import qualified Data.Vector.Mutable as MV

len = 5000000

testL = do
  mv <- MV.new len
  let go i = do
        if i >= len then return () else
          do let st = show (i+10000000) -- no strictness flag
             MV.write mv i st
             go (i+1)
  go 0
  v <- GV.unsafeFreeze mv :: IO (V.Vector String)
  return v

main =
  do
    v <- testL
    print (V.length v)
    mapM_ print $ V.toList $ V.slice 4000000 5 v

$ cat testStrictS.hs
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import qualified Data.Vector as V
import qualified Data.Vector.Generic as GV
import qualified Data.Vector.Mutable as MV

len = 5000000

testS = do
  mv <- MV.new len
  let go i = do
        if i >= len then return () else
          do let !st = show (i+10000000) -- this has the strictness flag
             MV.write mv i st
             go (i+1)
  go 0
  v <- GV.unsafeFreeze mv :: IO (V.Vector String)
  return v

main =
  do
    v <- testS
    print (V.length v)
    mapM_ print $ V.toList $ V.slice 4000000 5 v

Compiling and running these two programs with GHC 7.0.3 on Ubuntu 10.10, I get the following results:

$ ghc --make testStrictL.hs -O3 -rtsopts
[2 of 2] Compiling Main             ( testStrictL.hs, testStrictL.o )
Linking testStrictL ...
$ ghc --make testStrictS.hs -O3 -rtsopts
[2 of 2] Compiling Main             ( testStrictS.hs, testStrictS.o )
Linking testStrictS ...

$ ./testStrictS +RTS -sstderr
./testStrictS +RTS -sstderr
5000000
"14000000"
"14000001"
"14000002"
"14000003"
"14000004"
     824,145,164 bytes allocated in the heap
   1,531,590,312 bytes copied during GC
     349,989,148 bytes maximum residency (6 sample(s))
       1,464,492 bytes maximum slop
             656 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  1526 collections,     0 parallel,  5.96s,  6.04s elapsed
  Generation 1:     6 collections,     0 parallel,  2.79s,  4.36s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.77s  (  2.64s elapsed)
  GC    time    8.76s  ( 10.40s elapsed)
  EXIT  time    0.00s  (  0.13s elapsed)
  Total time   10.52s  ( 13.04s elapsed)

  %GC time      83.2%  (79.8% elapsed)

  Alloc rate    466,113,027 bytes per MUT second

  Productivity  16.8% of total user, 13.6% of total elapsed

$ ./testStrictL +RTS -sstderr
./testStrictL +RTS -sstderr
5000000
"14000000"
"14000001"
"14000002"
"14000003"
"14000004"
      81,091,372 bytes allocated in the heap
     143,799,376 bytes copied during GC
      44,653,636 bytes maximum residency (3 sample(s))
       1,005,516 bytes maximum slop
              79 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:   112 collections,     0 parallel,  0.54s,  0.59s elapsed
  Generation 1:     3 collections,     0 parallel,  0.41s,  0.45s elapsed

  INIT  time    0.00s  (  0.03s elapsed)
  MUT   time    0.12s  (  0.18s elapsed)
  GC    time    0.95s  (  1.04s elapsed)
  EXIT  time    0.00s  (  0.06s elapsed)
  Total time    1.06s  (  1.24s elapsed)

  %GC time      89.1%  (83.3% elapsed)

  Alloc rate    699,015,343 bytes per MUT second

  Productivity  10.9% of total user, 9.3% of total elapsed

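Can someone explain why the strictness flag seems to make the program use so much memory? This simple example came out of my attempts to understand why my own program uses so much memory when it reads a large file of 5 million lines and builds a vector of records.
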
Answer 0 (score: 8)

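The problem here is mainly that you are using the String type (an alias for [Char]). Because a String is represented as a non-strict list of single Chars, it needs 5 words per character on the heap (see also this blog article for a comparison of memory footprints).
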
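In the lazy case, you basically store an unevaluated thunk that points to the (shared) evaluation function show . (+10000000) and a varying integer. In the strict case, the complete strings of 8 characters seem to be materialized, and that takes more heap space the longer the strings get. (Normally a bang pattern would only force the outermost list constructor :, i.e. only the first character of the lazy String would be evaluated.)
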
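The parenthetical remark about bang patterns can be checked with a small experiment. The following is only a minimal sketch: the name partial and the use of undefined as a stand-in for an unevaluated tail are illustrative assumptions and are not part of the original programs.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Exception (SomeException, evaluate, try)

-- Hypothetical test value: the first cons cell is defined, the tail is bottom.
partial :: String
partial = '1' : undefined

main :: IO ()
main = do
  -- The bang pattern evaluates to weak head normal form only,
  -- i.e. up to the outermost (:) constructor, so this does not fail.
  let !s = partial
  putStrLn "bang pattern succeeded: only the outermost (:) was forced"
  -- Traversing the whole list (here via length) does reach the undefined tail.
  r <- try (evaluate (length s)) :: IO (Either SomeException Int)
  putStrLn (either (const "full traversal hit the undefined tail") show r)
```

That show on an Int nevertheless appears to produce the whole string once its first constructor is demanded is a property of how that particular function builds its result, not of the bang pattern itself.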
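Storing 5000000 Strings of length 8 therefore requires 5000000 * 8 * 5 = 200000000 words, which on 32-bit corresponds to about 763 MiB. If the Char digits are shared, you only need 3/5 of that, i.e. about 458 MiB, which seems to match the memory overhead you observed.
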
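The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch under the same assumptions as the text (4-byte words, 3 words for each cons cell, 2 more words for each unshared boxed Char); the function names stringWords and wordsToMiB are made up for illustration.

```haskell
module Main (main) where

-- Heap words needed for `count` fully evaluated Strings of `chars` characters,
-- assuming a fixed number of words per character.
stringWords :: Integer -> Integer -> Integer -> Integer
stringWords count chars wordsPerChar = count * chars * wordsPerChar

-- Convert a number of machine words to MiB for a given word size in bytes.
wordsToMiB :: Integer -> Integer -> Double
wordsToMiB wordBytes ws = fromIntegral (ws * wordBytes) / (1024 * 1024)

main :: IO ()
main = do
  let unshared = stringWords 5000000 8 5  -- cons cell (3 words) + boxed Char (2 words)
      shared   = stringWords 5000000 8 3  -- cons cell only, Char values shared
  print unshared                 -- total words, unshared case
  print (wordsToMiB 4 unshared)  -- MiB on a 32-bit machine, unshared case
  print (wordsToMiB 4 shared)    -- MiB on a 32-bit machine, shared case
```

On a 64-bit machine the same word counts would cost twice as many bytes, so passing 8 instead of 4 to wordsToMiB gives the corresponding estimate there.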
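If you replace String with something more compact (such as Data.ByteString.ByteString), you will notice that the memory overhead is about an order of magnitude lower than with a plain String.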
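
As an illustration of that suggestion, here is one way the strict test program might be adapted to store ByteString values. It is a sketch, not a measured result: the name testB and the choice of Data.ByteString.Char8.pack for the conversion are assumptions, and no memory figures are claimed for it.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import qualified Data.ByteString.Char8 as B
import qualified Data.Vector as V
import qualified Data.Vector.Mutable as MV

len :: Int
len = 5000000

-- Same loop as testS, but each element is packed into a strict ByteString.
testB :: IO (V.Vector B.ByteString)
testB = do
  mv <- MV.new len
  let go i
        | i >= len = return ()
        | otherwise = do
            -- Forcing a strict ByteString to WHNF evaluates its whole contents,
            -- so no thunk and no intermediate String is retained in the vector.
            let !st = B.pack (show (i + 10000000))
            MV.write mv i st
            go (i + 1)
  go 0
  V.unsafeFreeze mv

main :: IO ()
main = do
  v <- testB
  print (V.length v)
  mapM_ B.putStrLn (V.toList (V.slice 4000000 5 v))
```

Running it with +RTS -sstderr, as in the question, allows a direct comparison of maximum residency against the two String versions.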