线程二叉树结构在Haskell中是否具有任何优势?

时间:2014-09-20 00:43:44

标签: algorithm haskell tree functional-programming

我正在阅读this mailing list post,其中有人对Haskell中的线程RB树有疑问,并且响应结束时说:

  

我建议你(Lex)要么必须(必须使用STRef或IORef)或者这样做   没有线程,除非你确定你会做更多   查找和遍历而不是插入和删除。

暗示虽然在Haskell中创建线程树通常不是一个好主意,但它仍然可以使查找和遍历更有效率,而无需采用命令式算法。

然而,我想不出线程可以在不使用命令式构造的情况下使haskell树更有效的方式。它甚至可能吗?

1 个答案:

答案 0 :(得分:1)

  

然而,我无法想到一种线程可以使得哈克尔树更多的方式   有效而无需使用命令式结构。它甚至可能吗?

从技术上讲,适用于命令式线程树的完全相同的好处也适用于持久性线程树。但是,由于这种数据结构的一些额外成本,它并不总是一个实际的选择。

考虑你有一棵树不会被修改的情况,但是你经常需要进行线性遍历(例如找到一个节点和后续的n节点,或者一个完整的线性遍历等)。在命令式语言中,线程树在这种情况下比非线程树更有效,因为线性遍历可以直接执行,而无需保持堆栈。应该清楚的是,这与持久性结构完全相同,因为我们假设树不会被修改,所以在持久性结构中,线性遍历的线程树也会更有效。 / p>

那么,持久性线程树的缺点是什么?首先,插入/删除将比普通树更昂贵,因为还需要重新创建修改后的节点之前的每个节点。因此,当突变很少或不存在时,该结构才有用。但在这种情况下,你可能最好从树中创建一个数组并遍历它(除非你想查找起始位置)。因此它最终是一个相当复杂的数据结构,只能在非常有限的情况下使用。但是对于那个非常具体的用例,它比普通的二叉树更有效。

编辑:这里是一个如何纯粹实现线程化二叉树的示例。删除的实现留作练习,没有尝试保持树的平衡,我没有做出正确的承诺。但是在使用Tree建立Prelude.foldl Threaded.insert Threaded.empty之后,Data.Foldable.toListfoldThread (:[])都会返回相同的列表,因此它可能非常接近正确。

{-# LANGUAGE DeriveFoldable #-}
module Threaded where

import Control.Applicative
import Control.Monad
import Data.Foldable (Foldable (..))
import Data.Monoid

newtype Tree a = Tree {unTree :: Maybe (NonNullTree a) }
  deriving (Eq, Foldable)

-- its a little easier to work with non-null trees.
data NonNullTree a = Bin (Link a) a (Link a)

data Link a =
    Normal (NonNullTree a)  -- a child branch
  | Thread (NonNullTree a)  -- a thread to another branch
  | Null                    -- left child of min value, or right child of max value
  -- N.B. don't try deriving type class instances, such as Eq or Show.  If you derive
  -- them, many of the derived functions will be infinite loops.  If you want instances
  -- for Show or Eq, you'll have to write them by hand and break the loops by
  -- not following Thread references.    

empty :: Tree a
empty = Tree Nothing

singleton :: a -> Tree a
singleton a = Tree . Just $ Bin Null a Null

instance Foldable NonNullTree where
    foldMap f (Bin l a r) = mconcat [foldMap f l, f a, foldMap f r]

-- when folding, we only want to follow actual children, not threads.
-- Using this instance, we can compare with folding via threads.
instance Foldable Link where
    foldMap f (Normal t) = foldMap f t
    foldMap f _ = mempty

-- |find the first value in the tree >= the search term
-- O(n) complexity, we can do better!
tlookup :: Ord a => Tree a -> a -> Maybe a
tlookup tree needle = getFirst $ foldMap search tree
  where
    search a = if a >= needle then First (Just a) else mempty

-- | fold over the tree by following the threads.  The signature matches `foldMap` for easy
-- comparison, but `foldl'` or `traverse` would likely be more common operations.
foldThread :: Monoid m => (a -> m) -> Tree a -> m
foldThread f (Tree (Just root)) = deep mempty root
  where
    -- descend to the leftmost child, then follow threads to the right.
    deep acc (Bin l a r) = case l of
        Normal tree -> deep acc tree
        _ -> follow (acc `mappend` f a) r
    follow acc (Normal tree) = deep acc tree
    -- in this case we know the left child is a thread pointing to the
    -- current node, so we can ignore it.
    follow acc (Thread (Bin _ a r)) = follow (acc `mappend` f a) r
    follow acc Null = acc

-- used internally.  sets the left child of the min node to the 'prev0' link,
-- and the right child of the max node to the 'next0' link.
relinkEnds :: Link a -> Link a -> NonNullTree a -> NonNullTree a
relinkEnds prev0 next0 root = case go prev0 next0 root of
    Normal root' -> root'
    _ -> error "relinkEnds: invariant violation"
  where
    go prev next (Bin l a r) =
        -- a simple example of knot-tying.
        -- * l' depends on 'this'
        -- * r' depends on 'this'
        -- * 'this' depends on both l' and r'
        -- the whole thing works because Haskell is lazy, and the recursive 'go'
        -- function never actually inspects the 'prev' and 'next' arguments.
        let l' = case l of
                Normal lTree -> go prev (Thread this) lTree
                _ -> prev
            r' = case r of
                Normal rTree -> go (Thread this) next rTree
                _ -> next
            this = Bin l' a r'
        in Normal this

-- | insert a value into the tree, overwriting it if already present.
insert :: Ord a => Tree a -> a -> Tree a
insert (Tree Nothing) a = singleton a
insert (Tree (Just root)) a = case go Null Null root of
    Normal root' -> Tree $ Just root'
    _ -> error "insert: invariant violation"
  where
    go prev next (Bin l val r) = case compare a val of
        LT ->
            -- ties a knot similarly to the 'relinkEnds' function.
            let l' = case l of
                      Normal lTree -> go prev thisLink lTree
                      _ -> Normal $ Bin prev a thisLink
                r' = case r of
                      Normal rTree -> Normal $ relinkEnds thisLink next rTree
                      _ -> next
                this = Bin l' val r'
                thisLink = Thread this
            in Normal this
        EQ ->
            let l' = case l of
                      Normal lTree -> Normal $ relinkEnds prev thisLink lTree
                      _ -> prev
                r' = case r of
                      Normal rTree -> Normal $ relinkEnds thisLink next rTree
                      _ -> next
                this = Bin l' a r'
                thisLink = Thread this
            in Normal this
        GT ->
            let l' = case l of
                      Normal lTree -> Normal $ relinkEnds prev thisLink lTree
                      _ -> prev
                r' = case r of
                      Normal rTree -> go thisLink next rTree
                      _ -> Normal $ Bin thisLink a next
                this = Bin l' val r'
                thisLink = Thread this
            in Normal this