I have just started learning concurrent programming. In the case below, the threaded build uses more total memory, which is expected. However, it allocates less heap than the non-threaded program. Isn't that strange? Or is it simply that a program compiled with the -threaded option performs poorly when run with the -N1 option?
I cannot understand why the threaded case allocates less heap. My guess is that thread switching produces this; a minimal sketch of the pattern I suspect follows. Can you explain what is happening?
I don't think this program is special, and its input (the schedules) is small.
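To make my guess concrete, here is a minimal sketch, written by me and not part of the program below, of the polling pattern that both the scheduler and the threads use. Every trip around the loop polls an MVar with `readMVar`, and as far as I understand each iteration allocates a little heap, so the total allocation should grow with the number of spin iterations:

import Control.Concurrent

-- Busy-wait until the shared counter reaches the target value.
-- Nothing blocks here: the loop keeps polling (and, I believe,
-- allocating) until some other thread updates the MVar.
spinUntil :: Int -> MVar Int -> IO ()
spinUntil target mv = loop
  where
    loop = do
        now <- readMVar mv
        if now == target
          then return ()
          else loop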
import Control.Concurrent
import Control.Monad
import System.IO
{-
This code hangs at "Start Scheduler" when compiled with an "-O*"
option unless `hFlush stdout` is added. That problem is described
in another question (http://stackoverflow.com/questions/33311980).
In this code, the problem isn't fixed by weighting the processing
load, so please do not compile this code with an optimize option.
-}
main = do
    count    <- newMVar 0
    clock    <- newMVar 0
    finished <- newMVar 0
    out      <- newMVar ()
    mapM_ (\thID -> forkIO $ aThread thID count out clock finished)
          [0 .. length schedules - 1]
    wmPutStrLn out "Start Scheduler"
    scheduler out count clock finished (length schedules)
  where
    scheduler out c clk fin thNum = loop
      where
        loop = do
            finished <- readMVar fin
            if finished == thNum
              then return ()
              else do
                now <- readMVar c
                when (now == thNum - finished) $ do
                    _ <- swapMVar c 0 -- I don't like this.
                    modifyMVar_ clk (return . (+1))
                    nClk <- readMVar clk
                    wmPutStrLn out $ "Clock up: " ++ show nClk
                loop

    -- A thread. I think you don't care about the details; it is
    -- just a processing load.
    aThread thID c out clk fin = do
        wmPutStrLn out "A thread is initialized"
        loop (schedules !! thID) 0
      where
        loop aSch pClk =
            if null aSch
              then modifyMVar_ fin (return . (+1))
              else do
                now <- readMVar clk
                if pClk == now
                  then do
                    modifyMVar_ c (return . (+1))
                    let (t, aMsg) = head aSch
                    if now == t
                      then do
                        wmPutStrLn out aMsg
                        loop (tail aSch) (pClk + 1)
                      else loop aSch (pClk + 1)
                  else loop aSch pClk
schedules :: [[(Int, String)]]
schedules =
    [ [(1, "A1"), (2, "A2")]
    , [(0, "B0"), (3, "B3")]
    , [(1, "C1"), (2, "C2"), (3, "C3")]
    , [(0, "D0"), (2, "D2")]
    , [(0, "E0"), (1, "E1"), (2, "E2")]
    , [(5, "F5")]
    , []
    , [(4, "H4"), (6, "H6"), (8, "H8")]
    ]

wmPutStrLn mv str = withMVar mv $ \_ -> putStrLn str
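As the comment at the top says, the program hangs at "Start Scheduler" when built with "-O*" unless the output is flushed. The workaround mentioned there would look roughly like this; the placement is my assumption, since the posted code does not include it:

-- Hypothetical variant of wmPutStrLn with the `hFlush stdout`
-- workaround from the comment above (not in the posted code).
wmPutStrLnFlushed :: MVar () -> String -> IO ()
wmPutStrLnFlushed mv str = withMVar mv $ \_ -> do
    putStrLn str
    hFlush stdout

For the runs below, the binary was built without optimization and with the threaded runtime, with something like `ghc -threaded -rtsopts Scheduler04.toAsk.hs` (the exact command is my assumption; I believe `-rtsopts` is what allows `+RTS -s`).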
Compare these outputs:
$ ./Scheduler04.toAsk +RTS -s -N1 > /dev/null
3,446,295,648 bytes allocated in the heap
4,112,880 bytes copied during GC
56,776 bytes maximum residency (2 sample(s))
21,048 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 6636 colls, 0 par 0.020s 0.024s 0.0000s 0.0007s
Gen 1 2 colls, 0 par 0.000s 0.000s 0.0001s 0.0001s
TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.000s ( 0.000s elapsed)
MUT time 1.246s ( 1.266s elapsed)
GC time 0.020s ( 0.024s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 1.267s ( 1.291s elapsed)
Alloc rate 2,766,218,149 bytes per MUT second
Productivity 98.4% of total user, 96.5% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0
$ ./Scheduler04.toAsk +RTS -s -N2 > /dev/null
1,027,098,440 bytes allocated in the heap
1,070,584 bytes copied during GC
70,000 bytes maximum residency (2 sample(s))
34,568 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 1048 colls, 1048 par 0.060s 0.005s 0.0000s 0.0001s
Gen 1 2 colls, 1 par 0.000s 0.000s 0.0001s 0.0001s
Parallel GC work balance: 22.49% (serial 0%, perfect 100%)
TASKS: 6 (1 bound, 5 peak workers (5 total), using -N2)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.000s ( 0.000s elapsed)
MUT time 0.494s ( 0.285s elapsed)
GC time 0.060s ( 0.005s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 0.555s ( 0.291s elapsed)
Alloc rate 2,079,458,137 bytes per MUT second
Productivity 89.2% of total user, 170.2% of total elapsed
gc_alloc_block_sync: 20
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0
$ ./Scheduler04.toAsk +RTS -s -N4 > /dev/null
277,518,568 bytes allocated in the heap
597,480 bytes copied during GC
96,400 bytes maximum residency (2 sample(s))
47,648 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 297 colls, 297 par 0.247s 0.002s 0.0000s 0.0001s
Gen 1 2 colls, 1 par 0.000s 0.000s 0.0001s 0.0002s
Parallel GC work balance: 34.57% (serial 0%, perfect 100%)
TASKS: 10 (1 bound, 9 peak workers (9 total), using -N4)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.000s ( 0.001s elapsed)
MUT time 0.178s ( 0.109s elapsed)
GC time 0.247s ( 0.002s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 0.427s ( 0.112s elapsed)
Alloc rate 1,561,830,669 bytes per MUT second
Productivity 41.9% of total user, 159.5% of total elapsed
gc_alloc_block_sync: 106
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 1
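One more observation, with my own arithmetic on numbers copied from the outputs above: dividing the bytes allocated by the MUT time gives the same order of magnitude in all three runs, matching the "Alloc rate" lines. So the total allocation seems to track how long the mutator runs, which is why I suspect the spinning threads:

-- Bytes allocated divided by MUT time; numbers copied from above.
allocPerMutSec :: [(String, Double)]
allocPerMutSec =
    [ ("-N1", 3446295648 / 1.246) -- ~2.77e9, cf. 2,766,218,149 above
    , ("-N2", 1027098440 / 0.494) -- ~2.08e9, cf. 2,079,458,137
    , ("-N4",  277518568 / 0.178) -- ~1.56e9, cf. 1,561,830,669
    ]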