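I am currently trying to understand how to write parallel programs in Haskell. I am following the paper "A Tutorial on Parallel and Concurrent Programming in Haskell" by Simon Peyton Jones and Satnam Singh. The source code is as follows: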
module Main where
import Control.Parallel
import System.Time
main :: IO ()
main = do
  putStrLn "Starting computation....."
  t0 <- getClockTime
  pseq r1 (return ())
  t1 <- getClockTime
  putStrLn ("sum: " ++ show r1)
  putStrLn ("time: " ++ show (secDiff t0 t1) ++ " seconds")
  putStrLn "Finish."
r1 :: Int
r1 = parSumFibEuler 38 5300
-- This is the Fibonacci number generator
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
-- Gets the euler sum
mkList :: Int -> [Int]
mkList n = [1..n-1]
relprime :: Int -> Int -> Bool
relprime x y = gcd x y == 1
euler :: Int -> Int
euler n = length $ filter (relprime n) (mkList n)
sumEuler :: Int -> Int
sumEuler = sum.(map euler).mkList
-- Gets the sum of Euler and Fibonacci (NORMAL)
sumFibEuler :: Int -> Int -> Int
sumFibEuler a b = fib a + sumEuler b
-- Gets the sum of Euler and Fibonacci (PARALLEL)
parSumFibEuler :: Int -> Int -> Int
parSumFibEuler a b =
  f `par` (e `pseq` (f + e))
  where
    f = fib a
    e = sumEuler b
-- Measure time
secDiff :: ClockTime -> ClockTime -> Float
secDiff (TOD secs1 psecs1) (TOD secs2 psecs2)
  = fromInteger (psecs2 - psecs1) / 1e12 + fromInteger (secs2 - secs1)
I compile it with the following command:
ghc --make -threaded Main.hs
a) Using 1 core:
./Main +RTS -N1
b) Using 2 cores:
./Main +RTS -N2
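However, the run on one core takes 53.556 seconds, while the run on two cores takes 73.401 seconds. I don't understand how 2 cores can actually be slower than 1 core. Maybe the message-passing overhead is too large for this small program? The results in the paper are completely different from mine. Here are the output details.
For 1 core: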
Starting computation.....
sum: 47625790
time: 53.556335 seconds
Finish.
17,961,210,216 bytes allocated in the heap
12,595,880 bytes copied during GC
176,536 bytes maximum residency (3 sample(s))
23,904 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)
                                  Tot time (elapsed)  Avg pause  Max pause
Gen 0   34389 colls,     0 par      2.54s    2.57s     0.0001s    0.0123s
Gen 1       3 colls,     0 par      0.00s    0.00s     0.0007s    0.0010s
Parallel GC work balance: -nan (0 / 0, ideal 1)
                     MUT time (elapsed)    GC time (elapsed)
Task 0 (worker) :     0.00s  (  0.00s)      0.00s  (  0.00s)
Task 1 (worker) :     0.00s  ( 53.56s)      0.00s  (  0.00s)
Task 2 (bound)  :    50.49s  ( 50.99s)      2.52s  (  2.57s)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.00s ( 0.00s elapsed)
MUT time 50.47s ( 50.99s elapsed)
GC time 2.54s ( 2.57s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 53.02s ( 53.56s elapsed)
Alloc rate 355,810,305 bytes per MUT second
Productivity 95.2% of total user, 94.2% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0
For 2 cores:
Starting computation.....
sum: 47625790
time: 73.401146 seconds
Finish.
17,961,210,256 bytes allocated in the heap
12,558,088 bytes copied during GC
176,536 bytes maximum residency (3 sample(s))
195,936 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
                                  Tot time (elapsed)  Avg pause  Max pause
Gen 0   34389 colls, 34388 par      7.42s    4.73s     0.0001s    0.0205s
Gen 1       3 colls,     3 par      0.01s    0.00s     0.0011s    0.0017s
Parallel GC work balance: 1.00 (1432193 / 1429197, ideal 2)
                     MUT time (elapsed)    GC time (elapsed)
Task 0 (worker) :     1.19s  ( 40.26s)     16.95s  ( 33.15s)
Task 1 (worker) :     0.00s  ( 73.40s)      0.00s  (  0.00s)
Task 2 (bound)  :    54.50s  ( 68.67s)      3.66s  (  4.73s)
Task 3 (worker) :     0.00s  ( 73.41s)      0.00s  (  0.00s)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.00s ( 0.00s elapsed)
MUT time 68.87s ( 68.67s elapsed)
GC time 7.43s ( 4.73s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 76.31s ( 73.41s elapsed)
Alloc rate 260,751,318 bytes per MUT second
Productivity 90.3% of total user, 93.8% of total elapsed
gc_alloc_block_sync: 12254
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0
Answer 0 (score: 7)
r1 = sumFibEuler 38 5300
I believe you meant
r1 = parSumFibEuler 38 5300
On my configuration (using parSumFibEuler 45 8000 and only a single run):
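I suspect that the fib function is much more CPU-intensive than sumEuler. That would explain the small improvement with -N2. There won't be any work stealing in your situation.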
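To make the imbalance argument concrete, here is a rough bound. It is only a sketch: the symbols $T_f$ and $T_e$ (the sequential running times of fib a and sumEuler b) and the 9:1 ratio are illustrative assumptions, not measurements from the question.

```latex
% Two independent tasks on two cores can finish no sooner than the longer one:
T_2 \;\ge\; \max(T_f, T_e)
% so the speedup over the sequential time T_1 = T_f + T_e is bounded by
S \;=\; \frac{T_1}{T_2} \;\le\; \frac{T_f + T_e}{\max(T_f, T_e)}
  \;=\; 1 + \frac{\min(T_f, T_e)}{\max(T_f, T_e)}
% Hypothetical example: if T_f = 9\,T_e, then S \le 10/9 \approx 1.11
```

The bound only reaches 2 when both halves take the same time, which is why the two arguments need to be chosen so that fib and sumEuler cost roughly the same.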
With memoization your Fibonacci function would be much faster, but I don't think that is what you wanted to try.
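For completeness, a minimal sketch of what memoization could look like here, using a lazily shared list (one common idiom, not the only one). The names fibs and memoFib are hypothetical and do not appear in the question's code, and the result type is Integer rather than Int to avoid overflow for large inputs.

```haskell
module Main where

-- Lazily built list of Fibonacci numbers. Each element is computed at most
-- once and then shared, so reaching index n takes O(n) additions.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- Hypothetical helper: look up the n-th Fibonacci number in the shared list.
-- Assumes n >= 0; a negative index makes (!!) fail at run time.
memoFib :: Int -> Integer
memoFib n = fibs !! n

main :: IO ()
main = print (memoFib 38)
```

With this version the fib half of the computation becomes almost free, so it no longer makes a useful workload for a parallelism experiment.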
Edit: As mentioned in the comments, I think that with -N2 you get a lot of interruptions, since you have two cores available.
sum $ parMap rdeepseq (fib) [1..40]
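The parMap expression can be turned into a complete program roughly as follows. This is a sketch that assumes the parallel package is installed (it provides Control.Parallel.Strategies); the name parFibSum is hypothetical. It should be compiled with -threaded and run with +RTS -N2, as in the question.

```haskell
module Main where

import Control.Parallel.Strategies (parMap, rdeepseq)

-- Same naive Fibonacci as in the question.
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)

-- Hypothetical helper: create one spark per list element, fully evaluate
-- each result with rdeepseq, then sum the results sequentially.
parFibSum :: [Int] -> Int
parFibSum = sum . parMap rdeepseq fib

main :: IO ()
main = print (parFibSum [1 .. 40])
```

Because there are many sparks instead of just one, idle cores have something to steal, although the work is still uneven: the largest inputs dominate the running time.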
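From here:
Be careful when using all the processors in your machine: if some of your processors are in use by other programs, this can actually harm performance rather than improve it.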