如何从Trie结构中删除单词?

时间:2011-04-03 19:38:00

标签: data-structures haskell trie

也许我不够聪明地学习Haskell,但我会给它最后一次机会。 我一直坚持从树中删除条目,Trie就像结构更具体(http://en.wikipedia.org/wiki/Trie)。

我正在寻找任何建议(不是解决方案!)如何实现这样的纯函数。 我对一种算法有所了解。通过遍历整个树“跳过”等于该单词的每个字符的值来重新创建新树,如果找不到下一个字符,则边条件返回原始树。但是当角色也属于另一个词时会出现问题。

data Trie = Trie { commonness :: Maybe Int
                 , children :: [(Char, Trie)]
                 } deriving (Eq, Read, Show)

-- Creates an empty "dictionary"
trie :: Trie
trie = Trie { commonness = Nothing, children = [] }

-- Inserts a word with given commonness into dictionary
add :: String -> Int -> Trie -> Trie
add [] freq tree
    | (0 <= freq) && (freq <= 16) = tree { commonness = Just freq }
    | otherwise = error $ "Commonness out of bounds: " ++ (show freq)
add word freq tree = tree { children = traverse word (children tree) }
    where
        traverse [] tree = error $ "traverse called with [] " ++ (show tree)
        traverse (x:xs) [] = [(x, add xs freq trie)]
        traverse str@(x:xs) (t:ts)
            | x == fst t = (x, add xs freq $ snd t):ts
            | otherwise = t:(traverse str ts)

remove :: String -> Trie -> Trie
???

数据看起来像:

GHCi> putStrLn $ groom $ add "learn" 16 $ add "leap" 5 $ add "sing" 7 $ add "lift" 10 trie

Trie{commonness = Nothing,
     children =
       [('l',
         Trie{commonness = Nothing,
              children =
                [('i',
                  Trie{commonness = Nothing,
                       children =
                         [('f',
                           Trie{commonness = Nothing,
                                children = [('t', Trie{commonness = Just 10, children = []})]})]}),
                 ('e',
                  Trie{commonness = Nothing,
                       children =
                         [('a',
                           Trie{commonness = Nothing,
                                children =
                                  [('p', Trie{commonness = Just 5, children = []}),
                                   ('r',
                                    Trie{commonness = Nothing,
                                         children =
                                           [('n',
                                             Trie{commonness = Just 16, children = []})]})]})]})]}),
        ('s',
         Trie{commonness = Nothing,
              children =
                [('i',
                  Trie{commonness = Nothing,
                       children =
                         [('n',
                           Trie{commonness = Nothing,
                                children =
                                  [('g', Trie{commonness = Just 7, children = []})]})]})]})]}

2 个答案:

答案 0 :(得分:4)

如果您为子表使用Map Char Trie而不是[(Char,Trie)],这会更容易。这就是我要为这个答案所假设的。我会让你开始介绍感应案例:

import qualified Data.Map as Map

remove :: String -> Trie -> Trie
remove (c:cs) t = t { children = Map.alter remove' c (children t) }
    where
    remove' (Just t) = Just (remove cs t)
    remove' Nothing = Nothing
remove [] t = ...

我会把基础案件留给你。以下是我使用的Map函数的文档alter。如果您为Map实施了alter,则可以在不使用[(Char,a)]的情况下获得相同的解决方案。

练习: remove'非常罗嗦。看看您是否可以使用fmap缩短它。

答案 1 :(得分:0)

在Python中,您可以执行以下操作

def remove_string_helper(self, string, pnode, index):
    if pnode:
        flag = False
        if index < len(string):
            flag = self.remove_string_helper(string, pnode.childs.get(string[index]), index + 1)

        if index == len(string) and pnode.is_complete_word:
            pnode.is_complete_word = False
            return len(pnode.childs) == 0

        if flag:
            pnode.childs.pop(string[index])
            return len(self.childs) == 0

    return False

def remove_string(self, string):
    self.remove_string_helper(string, self.childs.get(string[0]), 1)