import Data.List
test :: Int -> Int
test n = foldl' (+) 0 [1..n]
main :: IO ()
main = do
print $ test $ 10^8
GHC优化上述代码,以至于垃圾收集器甚至不需要做任何事情:
$ ghc -rtsopts -O2 testInt && ./testInt +RTS -s
[1 of 1] Compiling Main ( testInt.hs, testInt.o )
Linking testInt ...
5000000050000000
51,752 bytes allocated in the heap
3,480 bytes copied during GC
44,384 bytes maximum residency (1 sample(s))
17,056 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 0 colls, 0 par 0.000s 0.000s 0.0000s 0.0000s
Gen 1 1 colls, 0 par 0.000s 0.000s 0.0001s 0.0001s
INIT time 0.000s ( 0.000s elapsed)
MUT time 0.101s ( 0.101s elapsed)
GC time 0.000s ( 0.000s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 0.103s ( 0.102s elapsed)
%GC time 0.1% (0.1% elapsed)
Alloc rate 511,162 bytes per MUT second
Productivity 99.8% of total user, 100.9% of total elapsed
但是,如果我将test
的类型更改为test :: Word -> Word
,则会产生大量垃圾,代码运行速度会慢40倍。
ghc -rtsopts -O2 testWord && ./testWord +RTS -s
[1 of 1] Compiling Main ( testWord.hs, testWord.o )
Linking testWord ...
5000000050000000
11,200,051,784 bytes allocated in the heap
1,055,520 bytes copied during GC
44,384 bytes maximum residency (2 sample(s))
21,152 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 21700 colls, 0 par 0.077s 0.073s 0.0000s 0.0000s
Gen 1 2 colls, 0 par 0.000s 0.000s 0.0001s 0.0001s
INIT time 0.000s ( 0.000s elapsed)
MUT time 4.551s ( 4.556s elapsed)
GC time 0.077s ( 0.073s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 4.630s ( 4.630s elapsed)
%GC time 1.7% (1.6% elapsed)
Alloc rate 2,460,957,186 bytes per MUT second
Productivity 98.3% of total user, 98.3% of total elapsed
为什么会这样?我预计性能几乎相同? (我在x86_64 GNU / Linux上使用GHC版本8.0.1)
编辑:我提交了一个错误:https://ghc.haskell.org/trac/ghc/ticket/12354#ticket
答案 0 :(得分:10)
这可能主要是(尽管不是唯一的)由于Int而不是Word存在的重写规则。我之所以这么说,是因为如果我们在-fno-enable-rewrite-rules
案例中使用Int
,我们会得到一个比Word
案例更接近但不太可能的时间。
% ghc -O2 so.hs -fforce-recomp -fno-enable-rewrite-rules && time ./so
[1 of 1] Compiling Main ( so.hs, so.o )
Linking so ...
5000000050000000
./so 1.45s user 0.03s system 99% cpu 1.489 total
如果我们使用-ddump-rule-rewrites
转储重写规则并对这些规则进行区分,那么我们会看到Int
案例中的规则而不是Word
案例:
Rule: fold/build
Before: GHC.Base.foldr
...
该特定规则在Base 4.9 GHC.Base第823行(N.B.我实际上自己使用GHC 7.10)并没有明确提及Int
。我很好奇为什么它不会被Word
解雇,但现在还没有时间进一步调查。
答案 1 :(得分:2)
正如dfeuer在此处的评论中指出的那样,Enum
的{{1}}实例优于Int
的{{1}}实例:
Word
:
Int
现在instance Enum Int where
{-# INLINE enumFromTo #-}
enumFromTo (I# x) (I# y) = eftInt x y
{-# RULES
"eftInt" [~1] forall x y. eftInt x y = build (\ c n -> eftIntFB c n x y)
"eftIntList" [1] eftIntFB (:) [] = eftInt
#-}
{- Note [How the Enum rules work]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Phase 2: eftInt ---> build . eftIntFB
* Phase 1: inline build; eftIntFB (:) --> eftInt
* Phase 0: optionally inline eftInt
-}
{-# NOINLINE [1] eftInt #-}
eftInt :: Int# -> Int# -> [Int]
-- [x1..x2]
eftInt x0 y | isTrue# (x0 ># y) = []
| otherwise = go x0
where
go x = I# x : if isTrue# (x ==# y)
then []
else go (x +# 1#)
{-# INLINE [0] eftIntFB #-}
eftIntFB :: (Int -> r -> r) -> r -> Int# -> Int# -> r
eftIntFB c n x0 y | isTrue# (x0 ># y) = n
| otherwise = go x0
where
go x = I# x `c` if isTrue# (x ==# y)
then n
else go (x +# 1#)
-- Watch out for y=maxBound; hence ==, not >
-- Be very careful not to have more than one "c"
-- so that when eftInfFB is inlined we can inline
-- whatever is bound to "c"
实际上使用了Word
Integer
使用
enumFromTo n1 n2 = map integerToWordX [wordToIntegerX n1 .. wordToIntegerX n2]
现在instance Enum Integer where
enumFromTo x lim = enumDeltaToInteger x 1 lim
已设置重写规则,但事实证明enumDeltaToInteger
的{{1}}从未内联,因此此设置无法在此处融合。
将此函数复制到我的测试代码中会导致GHC内联它,触发Word
规则,并严重削减分配,但是从enumFromTo
(确实分配)的转换仍然存在。