Haskell中的简单搜索树:为什么堆栈溢出?

时间:2014-12-17 13:17:30

标签: haskell tree lazy-evaluation evaluation

我是Haskell的新手并且正在努力学习。我决定写一个简短的(不平衡的)二进制搜索树代码来开始。它将文本分成单词,将单词添加到二叉树中(丢弃重复),然后遍历树以打印出文本中排序的单词列表。

data BinTree t = ExternalNode
               | InternalNode (BinTree t) t (BinTree t)

treeInsert :: Ord t => BinTree t -> t -> BinTree t
treeInsert ExternalNode                     w = InternalNode ExternalNode w ExternalNode
treeInsert tree@(InternalNode left v right) w
  | w == v    = tree
  | w < v     = InternalNode (treeInsert left w) v right
  | otherwise = InternalNode left v (treeInsert right w)

treeFromList :: Ord t => [t] -> BinTree t
treeFromList l = go ExternalNode l
  where
    go acc []       = acc
    go acc (x : xs) = acc `seq` go (treeInsert acc x) xs

inOrderList :: BinTree t -> [t]
inOrderList ExternalNode                = []
inOrderList (InternalNode left v right) = (inOrderList left) ++ [ v ] ++ (inOrderList right)

main :: IO ()
main = do
  tmp <- readFile "words.txt"
  printList . inOrderList . treeFromList $ words tmp

  where
    printList []       = return ()
    printList (x : xs) = do
      putStrLn x
      printList xs

该程序适用于小文本。然后我喂了詹姆斯国王圣经。它崩溃抱怨堆栈大小太小。我必须将堆栈大小增加到200M以使其工作!

我的错误在哪里?我想它可能与懒惰的评估混乱的东西有关。在任何情况下,问题都不在于二叉搜索树的深度,对于圣经示例而言只有163。

1 个答案:

答案 0 :(得分:2)

问题是你正在构建太深嵌套的thunk。

此版本在seq中添加了treeInsert次调用,以强制在树的每个级别进行评估 并且可以在非常小的堆栈中运行:

import System.Environment
import Control.Monad

data BinTree t = ExternalNode
               | InternalNode (BinTree t) !t (BinTree t)

treeInsert :: Ord t => BinTree t -> t -> BinTree t
treeInsert ExternalNode                     w = InternalNode ExternalNode w ExternalNode
treeInsert tree@(InternalNode left v right) w
  | w == v    = tree
  | w < v     = let t = treeInsert left w  in t `seq` InternalNode t v right
  | otherwise = let t = treeInsert right w in t `seq` InternalNode left v t

treeFromList :: Ord t => [t] -> BinTree t
treeFromList l = go ExternalNode l
  where
    go acc []       = acc
    go acc (x : xs) = let t = treeInsert acc x in t `seq` go t xs

inOrderList :: BinTree t -> [t]
inOrderList ExternalNode                = []
inOrderList (InternalNode left v right) = (inOrderList left) ++ [ v ] ++ (inOrderList right)

main1 = do
  (arg0:_) <- getArgs
  tmp <- readFile arg0
  let t = treeFromList $ words tmp
  forM_ (inOrderList t) putStrLn

main = main1

您还可以在BinTree

的定义中使用严格注释
data BinTree t = ExternalNode | InternalNode !(BinTree t) !t !(BinTree t)

代替seq中的treeInsert来电 - 这就是Data.Set所做的。

seq中的treeFromList来电似乎没有多大影响。