为什么这个Haskell程序表现如此糟糕?

时间:2013-12-31 01:19:53

标签: performance haskell

我是一名Haskell新手,我很遗憾这个程序表现如此糟糕。我尝试在各个地方强制使用严格的变量,但似乎没有什么区别。

这是我的代码(该程序的目的是产生从标准输入中找到的输入字节的频率):

{-# LANGUAGE BangPatterns #-}

import Control.Concurrent (forkIO, killThread)
import Control.Concurrent.MVar
import qualified Data.IntMap as IntMap
import Data.IntMap.Strict (IntMap)
import Control.Monad.Fix
import Control.Monad (when)
import qualified Data.Char as Char
import qualified System.IO as IO
import System.IO (hSetBinaryMode, hFlush)
import Data.List as List
import Text.PrettyPrint.Boxes as Boxes
import Text.Printf (printf)
import Data.Function

data BFreq = BFreq Integer (IntMap Integer)

main :: IO ()
main = do
  putStrLn "analyze data from stdin"
  hSetBinaryMode IO.stdin True
  mv <- newEmptyMVar
  tid <- forkIO $ statusUpdater mv
  bf <- run mv
  killThread tid
  displayResults bf

resultTable :: [[String]] -> Box
resultTable rows =
  Boxes.hsep 4 Boxes.left boxed_cols
  where
    cols       = transpose rows
    boxed_cols = map (Boxes.vcat Boxes.left . map text) cols

displayResults :: BFreq -> IO ()
displayResults (BFreq n counts) = do
  putStrLn $ "read " ++ (show n) ++ " bytes"
  when (n > 0) (displayFreqs n counts)

displayFreqs :: Integer -> IntMap Integer -> IO ()
displayFreqs n counts =
  do
    putStrLn "frequencies:"
    Boxes.printBox $ resultTable rows
  where
    cmp x y       = compare (snd y) (snd x)
    sorted_counts = List.sortBy cmp $ IntMap.assocs counts

    intdiv :: Integer -> Integer -> Float
    intdiv a b = (fromIntegral a) / (fromIntegral b)

    percent y    = printf "%.2f" (100*intdiv y n)
    show_byte x  = (show $ Char.chr x) ++ " (" ++ (show x) ++ "):"
    show_count y = (percent y) ++ "% (" ++ (show y) ++ ")"

    rows = map (\(x,y) -> [show_byte x, show_count y]) sorted_counts


run :: MVar Integer -> IO BFreq
run mv = 
  fn mv 0 IntMap.empty 
  where
    fn mv !n !mp =
      do
        tryPutMVar mv n
        eof <- IO.isEOF
        if eof
          then return $ BFreq n mp
          else do
            b <- getChar
            fn mv (1+n) (new_map b)
      where
        k x       = Char.ord x
        old_val x = IntMap.findWithDefault 0 (k x) mp
        new_map x = IntMap.insert (k x) ((old_val x)+1) mp

statusUpdater :: MVar Integer -> IO ()
statusUpdater mv = 
  do
    takeMVar mv >>= print_progress
    statusUpdater mv
  where
    print_progress n = 
      do
        putStr $ "\rbytes: "
        when (gbs > 0) $ putStr $ (show gbs) ++ " GBs "
        when (mbs > 0) $ putStr $ (show mbs) ++ " MBs "
        when (kbs > 0) $ putStr $ (show kbs) ++ " KBs "
        when (gbs < 1 && mbs < 1 && kbs < 1) $ putStr $ (show bs) ++ " Bs "
        hFlush IO.stdout
      where
        (gbs, gbr)   = quotRem n 0x40000000
        (mbs, mbr)   = quotRem gbr 0x100000
        (kbs, bs)    = quotRem mbr 0x400

这是我运行时会发生什么(注意:我正在使用-O2进行编译):

$> cabal build -v                                                                                             
creating dist/build                                                                                                                       
creating dist/build/autogen                                                                                                                 
Building bfreq-0.1.0.0...                                                                                                                   
Preprocessing executable 'bfreq' for bfreq-0.1.0.0...                                                                                       
Building executable bfreq...                                                                                                                  
creating dist/build/bfreq                                                                                                                     
creating dist/build/bfreq/bfreq-tmp                                                                                                           
/usr/bin/ghc --make -o dist/build/bfreq/bfreq -hide-all-packages -fbuilding-cabal-package -package-conf dist/package.conf.inplace -i -idist/build/bfreq/bfreq-tmp -i. -idist/build/autogen -Idist/build/autogen -Idist/build/bfreq/bfreq-tmp -optP-include -optPdist/build/autogen/cabal_macros.h -odir dist/build/bfreq/bfreq-tmp -hidir dist/build/bfreq/bfreq-tmp -stubdir dist/build/bfreq/bfreq-tmp -package-id base-4.5.0.0-40b99d05fae6a4eea95ea69e6e0c9702 -package-id boxes-0.1.3-e03668bca38fe3e879f9d695618ddef3 -package-id containers-0.5.3.1-80819105034e34d03d22b1c20d6fd868 -O -O2 -rtsopts -XHaskell98 ./bfreq.hs
[1 of 1] Compiling Main             ( bfreq.hs, dist/build/bfreq/bfreq-tmp/Main.o )
Linking dist/build/bfreq/bfreq ...
$> cat /dev/urandom | head -c 9999999 > test_data
$> cat ./test_data | ./dist/build/bfreq/bfreq +RTS -sstderr
analyze data from stdin
bytes: 9 MBs 521 KBs read 9999999 bytes
frequencies:
'\137' (137):    0.40% (39642)
'H' (72):        0.40% (39608)
<...>
'L' (76):        0.39% (38617)
'\246' (246):    0.39% (38609)
'I' (73):        0.38% (38462)
'q' (113):       0.38% (38437)
   9,857,106,520 bytes allocated in the heap
  14,492,245,840 bytes copied during GC
   3,406,696,360 bytes maximum residency (13 sample(s))
      14,691,672 bytes maximum slop
            6629 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     18348 colls,     0 par   10.90s   10.90s     0.0006s    0.0180s
  Gen  1        13 colls,     0 par   15.20s   19.65s     1.5119s    12.6403s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   14.45s  ( 14.79s elapsed)
  GC      time   26.10s  ( 30.56s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   40.55s  ( 45.35s elapsed)

  %GC     time      64.4%  (67.4% elapsed)

  Alloc rate    682,148,818 bytes per MUT second

  Productivity  35.6% of total user, 31.9% of total elapsed

因此,除非我误解上述调试输出,否则我的程序使用6 GB? 测试数据小于10 MB,所以发生了什么?

关于如何在Haskell中处理这样的问题的任何一般建议也会很好。换句话说,我应该为这种以I / O为中心的东西避免使用Haskell吗?我应该使用管道库来做这种事吗?

编辑: 感谢您的帮助,正确导入严格版本的IntMap可以解决内存问题。

我无法使分析(-fprof-auto)工作,因为似乎没有编译我的软件包进行分析。我通过为我的操作系统(ubuntu:ghc-prof)安装ghc profiling包解决了缺少性能分析库的问题,但根据this,我需要手动重新安装所有的haskell库以进行性能分析。我目前没有时间这样做,所以我只是把这个链接放在这里,以免造成类似问题的人。

1 个答案:

答案 0 :(得分:12)

如果按照the GHC guide chapter on profiling使用-fprof-auto进行编译,则会在run.fn.new_maprun.fn中看到大量分配。

有问题的代码:

new_map x = IntMap.insert (k x) ((old_val x)+1) mp

怀疑:((old_val x)+1)正在创造一连串未经评估的thunk。拟议的变更:

new_map x = let ov  = old_val x + 1 in
            ov `seq` IntMap.insert (k x) ov mp

瞧!分配,GC和内存使用都一路下来。

编辑:您可能打算import qualified Data.IntMap.Strict as IntMap,不需要进行此更改。