如何在Haskell中实现B +树?

时间:2013-12-01 06:41:25

标签: haskell b-tree

B +树将叶节点链接在一起。将B +树的指针结构视为有向图,它不是循环的。但是忽略指针的方向并将其视为无向的,链接在一起的叶节点会在图中创建循环。

在Haskell中,如何将叶子构造为父内部节点的子节点,同时构建相邻叶节点的下一个链接。怎样才能用Haskell的代数数据类型做到这一点?似乎Haskell ADT通常使得类似循环的结构难以表达。

2 个答案:

答案 0 :(得分:15)

这是(immutable /“mutable”-by-reconstruction / zipperable)ADT表示(涉及不可变vectors)的想法:

module Data.BTree.Internal where

import Data.Vector

type Values v = Vector v

type Keys k = Vector k

data Leaf k v
  = Leaf
    { _leafKeys   :: !(Keys k)
    , _leafValues :: !(Values v)
    , _leafNext   :: !(Maybe (Leaf k v)) -- @Maybe@ is lazy in @Just@, so this strict mark
                                         -- is ok for tying-the-knot stuff.
    -- , _leafPrev   :: !(Maybe (Leaf k v))
    -- ^ for doubly-linked lists of leaves
    }

type Childs k v = Vector (BTree k v)

data Node k v
  = Node
    { _nodeKeys   :: !(Keys k)
    , _nodeChilds :: !(Childs k v)
    }

data BTree k v
  = BTreeNode !(Node k v)
  | BTreeLeaf !(Leaf k v)

newtype BTreeRoot k v
  = BTreeRoot (BTree k v)
  • 这应该是内部的,因此不正确使用原始构造函数,访问器或模式匹配不会破坏树。

  • KeysValuesChilds可以添加长度控制(使用运行时检查或可能使用GADT等)。

对于界面:

module Data.BTree ( {- appropriate exports -} ) where

import Data.Vector
import Data.BTree.Internal

-- * Building trees: "good" constructors.

keys :: [k] -> Keys k
keys = fromList

values :: [v] -> Values v
values = fromList

leaves :: [Leaf k v] -> Childs k v
leaves = fromList . fmap BTreeLeaf

leaf :: Keys k -> Values v -> Maybe (Leaf k v) -> Leaf k v
-- or
-- leaf :: Keys k -> Values v -> Maybe (Leaf k v) -> Maybe (Leaf k v) -> Leaf k v
-- for doubly-linked lists of leaves
leaf = Leaf

node :: Keys k -> Childs k v -> BTree k v
node ks = BTreeNode . Node ks

-- ...

-- * "Good" accessors.

-- ...

-- * Basic functions: insert, lookup, etc.

-- ...

然后是这种树:

B+tree example

可以构建为

test :: BTree Int ByteString
test = let
  root  = node (keys [3, 5]) (leaves [leaf1, leaf2, leaf3])
  leaf1 = leaf (keys [1, 2]) (values ["d1", "d2"]) (Just leaf2)
  leaf2 = leaf (keys [3, 4]) (values ["d3", "d4"]) (Just leaf3)
  leaf3 = leaf (keys [5, 6, 7]) (values ["d5", "d6", "d7"]) Nothing
  in root

这种技术称为"tying the knot"。叶子可循环使用:

  leaf1 = leaf (keys [1, 2]) (values ["d1", "d2"]) (Just leaf2)
  leaf2 = leaf (keys [3, 4]) (values ["d3", "d4"]) (Just leaf3)
  leaf3 = leaf (keys [5, 6, 7]) (values ["d5", "d6", "d7"]) (Just leaf1)

或双重关联(假设_leafPrev和相应的leaf函数):

  leaf1 = leaf (keys [1, 2]) (values ["d1", "d2"]) (Just leaf2) (Just leaf3)
  leaf2 = leaf (keys [3, 4]) (values ["d3", "d4"]) (Just leaf3) (Just leaf1)
  leaf3 = leaf (keys [5, 6, 7]) (values ["d5", "d6", "d7"]) (Just leaf1) (Just leaf2)

mutable vectorsmutable references

也可以实现完全可变的表示
type Values v = IOVector v

type Keys k = IOVector k

type Childs k v = IOVector (BTree k v)

    , _leafNext   :: !(IORef (Maybe (Leaf k v)))

依此类推,基本相同,但使用IORefIOVector,在IO monad中工作。

答案 1 :(得分:2)

也许这与你要找的相似?

data Node key value
    = Empty
    | Internal key [Node key value] -- key and children
    | Leaf value (Node key value) -- value and next-leaf
    deriving Show

let a = Leaf 0 b
    b = Leaf 1 c
    c = Leaf 2 d
    d = Leaf 3 Empty
in  Internal [Internal 0 [a,b], Internal 2 [c,d]]

此定义的一个问题是它不会阻止Leaf节点中的next-leaf成为Internal节点。

使用Haskell制作循环结构实际上很容易,甚至是无限的。例如,以下是无限的零列表,它是循环的。

let a = 0:a

你甚至可以进行相互递归,这更加循环:

let a = 0:b
    b = 1:a
in  a