For all paths in a Trie, get the maximum number of end words on a path

Time: 2019-05-20 08:23:54

Tags: python trie

Consider the following words:

'a', 'ab', 'abcd', 'b', 'bcd'

Adding them to a Trie produces the following structure, where an asterisk marks a node that ends a word:

              root 
              / \
            *a  *b
            /     \
          *b       c
          /         \
         c          *d
        /
      *d

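In this example there are two paths, and the maximum number of end words on any single path is 3 (a, ab, abcd). How would you perform a DFS to get this maximum?

Here is my Trie code: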

class TrieNode:

    def __init__(self):
        self.children = dict()
        self.end_word = 0

class Trie:

    def __init__(self):
        self.root = TrieNode()

    def insert(self, key):
        current = self.root
        for char in key:
            if char not in current.children:
                current.children[char] = TrieNode()
            current = current.children[char]
        current.end_word += 1

2 answers:

Answer 0: (score: 1)

You should add a method to your TrieNode. If I understand your question correctly, then for this trie:

          root 
          / \
        *a  *b
        /     \
      *b       c
      / \       \
     c  *d      *d
    /   /
  *d   *e

the result should be 4 (a, ab, abd, abde).

You can do this recursively:

class TrieNode:

    def __init__(self):
        self.children = dict()
        self.end_word = 0

    def count_end_words(self):
        if self.children:
            return self.end_word + max(child.count_end_words() for child in self.children.values())
        return self.end_word

class Trie:

    def __init__(self):
        self.root = TrieNode()

    def insert(self, key):
        current = self.root
        for char in key:
            if char not in current.children:
                current.children[char] = TrieNode()
            current = current.children[char]
        current.end_word += 1

    def max_path_count_end_words(self):
        return self.root.count_end_words()

root = Trie()
for word in ('a', 'ab', 'abcd', 'b', 'bcd', 'abd', 'abde'):
    root.insert(word)

print(root.max_path_count_end_words())  # prints 4
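The recursive method above is already a depth-first search. For very long words the recursion can run into Python's recursion limit (usually 1000 by default), so here is a minimal sketch of the same traversal with an explicit stack. The helper names `insert` and `max_end_words_iterative` are hypothetical and not part of the original code; the node layout is the same as in the question.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.end_word = 0


def insert(root, key):
    """Insert key below root, counting how many times it ends at a node."""
    node = root
    for char in key:
        if char not in node.children:
            node.children[char] = TrieNode()
        node = node.children[char]
    node.end_word += 1


def max_end_words_iterative(root):
    """Return the largest number of end words on any root-to-leaf path."""
    best = 0
    # Each stack entry pairs a node with the end-word count accumulated
    # on the path from the root down to and including that node.
    stack = [(root, root.end_word)]
    while stack:
        node, count = stack.pop()
        if not node.children:
            # A leaf closes a path, so its running count is a candidate.
            best = max(best, count)
        for child in node.children.values():
            stack.append((child, count + child.end_word))
    return best


root = TrieNode()
for word in ('a', 'ab', 'abcd', 'b', 'bcd', 'abd', 'abde'):
    insert(root, word)

print(max_end_words_iterative(root))  # prints 4
```

Because the counts never decrease along a path, checking only the leaves is enough to find the maximum.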

As mentioned in the comments, you can avoid creating a separate TrieNode class. Here is one way to do it:

class Trie:

    def __init__(self):
        self.children = dict()
        self.is_end_word = False

    def insert(self, key):
        if not key:
            # The whole key has been consumed, so this node ends a word.
            self.is_end_word = True
            return
        char = key[0]
        if char not in self.children:
            self.children[char] = Trie()
        self.children[char].insert(key[1:])

    def max_path_count_end_words(self):
        # Booleans add as 0/1; int() keeps the leaf result an integer too.
        if self.children:
            return self.is_end_word + max(child.max_path_count_end_words() for child in self.children.values())
        return int(self.is_end_word)

root = Trie()
for word in ('a', 'ab', 'abcd', 'b', 'bcd', 'abd', 'abde'):
    root.insert(word)

print(root.max_path_count_end_words())  # prints 4

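Answer 1: (score: 0)

You can add a method to the TrieNode class that returns the node's own end_word count plus the sum of the same method called on all of its children: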

class TrieNode:
    def __init__(self):
        self.children = dict()
        self.end_word = 0
    def word_count(self):
        return self.end_word + sum(c.word_count() for c in self.children.values())

To use it:

t = Trie()
for w in ('a', 'ab', 'abcd', 'b', 'bcd'):
    t.insert(w)

t.root.word_count()
# 5

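To find the maximum, just call it on each of the root's children and compare. Note that this counts every word inside each top-level branch, which matches the per-path maximum only when no branch forks, as is the case in this example:

max(branch.word_count() for branch in t.root.children.values())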
# 3
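To make the difference between "words per branch" and "end words per path" concrete, here is a small sketch that runs both on the forked trie from Answer 0. The `build` helper and the `max_path_end_words` method are hypothetical names added for illustration; `word_count` follows the method shown in this answer.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.end_word = 0

    def word_count(self):
        # Total number of words stored in this node's whole subtree.
        return self.end_word + sum(c.word_count() for c in self.children.values())

    def max_path_end_words(self):
        # Largest number of end words on a single downward path from here.
        best_child = max(
            (c.max_path_end_words() for c in self.children.values()),
            default=0,
        )
        return self.end_word + best_child


def build(words):
    """Build a trie from an iterable of words and return its root node."""
    root = TrieNode()
    for word in words:
        node = root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.end_word += 1
    return root


forked = build(('a', 'ab', 'abcd', 'abd', 'abde', 'b', 'bcd'))
per_branch = max(branch.word_count() for branch in forked.children.values())
per_path = forked.max_path_end_words()
print(per_branch, per_path)  # prints 5 4
```

The `a` branch holds five words in total, but no single path through it passes more than four end words, so the two measures diverge as soon as a branch forks.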