Question

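I have implemented a trie data structure with two classes: Trie & TrieNode. I need to write a function that returns the longest string in the trie in O(h). My TrieNode has a LinkedList field that stores each node's children. We have not learned BFS or DFS yet, so I am trying to come up with a creative way to solve this.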

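I already have a function (a SEPARATE function) that inserts/creates a new node for a given char. When building the trie, each node is created with a field 'maxDepth = 0' that indicates its current depth. For every new node I create, I iterate up to its parent (each node already has a pointer to its parent), and so on until I reach the root, increasing each ancestor's depth by 1. I then want to write the function that returns the longest string like this: for each node, go over its children, look for the largest 'maxDepth', then descend into that child. Do this until you reach 'maxDepth == 0'. For example, my algorithm works for this string: "aacgace"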

       root      
       / \
   (2)a   g(0)     
     / 
 (1)c        
   / 
(0)e

=> 'ace' is indeed the longest. But it does not work for this string: "aacgae"

      root      
      /  \
   (2)a   g(0)     
    /  \
 (0)c  (0)e

=> It looks as if node 'a' has a child that itself has a child, but that is not the case.

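In general, I am trying to do the work in the first function, the one that builds the trie (running time: O(h*c)), so that the running time of the second function (the one that returns the longest string) is as small as I can make it: O(h).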
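The mismatch in the second diagram can be reproduced with a small sketch. It assumes a hypothetical CountingNode class (the question's own TrieNode is not shown) that bumps every ancestor by 1 whenever a node is created. Under that scheme maxDepth counts descendants rather than measuring height, so two sibling leaves look the same as a two-level chain. The height() method is there only for comparison and uses recursion.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical node type, written only to illustrate the counting scheme.
class CountingNode {
    final char ch;
    final CountingNode parent;                       // null for the root
    final List<CountingNode> children = new LinkedList<>();
    int maxDepth = 0;                                // bumped once per new descendant

    CountingNode(char ch, CountingNode parent) {
        this.ch = ch;
        this.parent = parent;
    }

    // Creates a child, then increments this node and every ancestor by 1,
    // as described in the question.
    CountingNode addChild(char c) {
        CountingNode child = new CountingNode(c, this);
        children.add(child);
        for (CountingNode n = this; n != null; n = n.parent) {
            n.maxDepth++;
        }
        return child;
    }

    // True height of the subtree below this node (for comparison only).
    int height() {
        int best = 0;
        for (CountingNode child : children) {
            best = Math.max(best, child.height() + 1);
        }
        return best;
    }
}
```

With 'c' and 'e' both added directly under 'a', a.maxDepth ends up as 2 while a.height() is 1, which is exactly the situation in the second diagram.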

Answer 1

I'm not sure what you are really trying to do, but you can find an example of a trie here.

Basically, I create the trie through a builder; let's take a quick look at how a word is added to the trie:

// In TrieBuilder
final TrieNodeBuilder nodeBuilder = new TrieNodeBuilder();

// ...

/**
 * Add one word to the trie
 *
 * @param word the word to add
 * @return this
 * @throws IllegalArgumentException word is empty
 */
public TrieBuilder addWord(@Nonnull final String word)
{
    Objects.requireNonNull(word);

    final int length = word.length();

    if (length == 0)
        throw new IllegalArgumentException("a trie cannot have empty "
            + "strings (use EMPTY instead)");
    nrWords++;
    maxLength = Math.max(maxLength, length);
    nodeBuilder.addWord(word);
    return this;
}

This delegates adding the word to the TrieNodeBuilder, which does this:

private boolean fullWord = false;

private final Map<Character, TrieNodeBuilder> subnodes
    = new TreeMap<>();

TrieNodeBuilder addWord(final String word)
{
    doAddWord(CharBuffer.wrap(word));
    return this;
}

/**
 * Add a word
 *
 * <p>Here also, a {@link CharBuffer} is used, which changes position as we
 * progress into building the tree, character by character, node by node.
 * </p>
 *
 * <p>If the buffer is "empty" when entering this method, it means a match
 * must be recorded (see {@link #fullWord}).</p>
 *
 * @param buffer the buffer (never null)
 */
private void doAddWord(final CharBuffer buffer)
{
    if (!buffer.hasRemaining()) {
        fullWord = true;
        return;
    }

    final char c = buffer.get();
    TrieNodeBuilder builder = subnodes.get(c);
    if (builder == null) {
        builder = new TrieNodeBuilder();
        subnodes.put(c, builder);
    }
    builder.doAddWord(buffer);
}

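Suppose we add both "trouble" and "troubling" to the trie; here is what happens:

the first time, a node is created for each character of "trouble";
the second time, all nodes up to "l" already exist; then all nodes for "ing" are created.

Now, if we add "troubles", another node will be created for "s" after the "e".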

The fullWord variable tells us whether there is a potential full match at this node; here is the search function:

public final class Trie
{
    private final int nrWords;
    private final int maxLength;
    private final TrieNode node;

    // ...

    /**
     * Search for a string into this trie
     *
     * @param needle the string to search
     * @return the length of the match (ie, the string) or -1 if not found
     */
    public int search(final String needle)
    {
        return node.search(needle);
    }
    // ...
}

And in TrieNode we have:

public final class TrieNode
{
    private final boolean fullWord;

    private final char[] nextChars;
    private final TrieNode[] nextNodes;

    // ...

    public int search(final String needle)
    {
        return doSearch(CharBuffer.wrap(needle), fullWord ? 0 : -1, 0);
    }

    /**
     * Core search method
     *
     * <p>This method uses a {@link CharBuffer} to perform searches, and changes
     * this buffer's position as the match progresses. The two other arguments
     * are the depth of the current search (ie the number of nodes visited
     * since root) and the index of the last node where a match was found (ie
     * the last node where {@link #fullWord} was true).</p>
     *
     * @param buffer the charbuffer
     * @param matchedLength the last matched length (-1 if no match yet)
     * @param currentLength the current length walked by the trie
     * @return the length of the match found, -1 otherwise
     */
    private int doSearch(final CharBuffer buffer, final int matchedLength,
        final int currentLength)
    {
        /*
         * Try and see if there is a possible match here; there is if "fullword"
         * is true, in this case the next "matchedLength" argument to a possible
         * child call will be the current length.
         */
        final int nextLength = fullWord ? currentLength : matchedLength;


        /*
         * If there is nothing left in the buffer, we have a match.
         */
        if (!buffer.hasRemaining())
            return nextLength;

        /*
         * OK, there is at least one character remaining, so pick it up and see
         * whether it is in the list of our children...
         */
        final int index = Arrays.binarySearch(nextChars, buffer.get());

        /*
         * If not, we return the last good match; if yes, we call this same
         * method on the matching child node with the (possibly new) matched
         * length as an argument and a depth increased by 1.
         */
        return index < 0
            ? nextLength
            : nextNodes[index].doSearch(buffer, nextLength, currentLength + 1);
    }
}

Note how, in the first call to doSearch(), -1 is passed as the matchedLength argument (the root is not a full word, so there is no match yet).

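Suppose we have a trie with the three words above; here is the call sequence for a search of "tr", which fails:

doSearch("tr", -1, 0) (node is root);
doSearch("tr", -1, 1) (node is 't');
doSearch("tr", -1, 2) (node is 'r');
no next character: return nextLength; nextLength is -1, no match.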

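Now, if we search for "troubles":

doSearch("troubles", -1, 0) (node is root);
doSearch("troubles", -1, 1) (node is 't');
doSearch("troubles", -1, 2) (node is 'r');
doSearch("troubles", -1, 3) (node is 'o');
doSearch("troubles", -1, 4) (node is 'u');
doSearch("troubles", -1, 5) (node is 'b');
doSearch("troubles", -1, 6) (node is 'l');
doSearch("troubles", -1, 7) (node is 'e');
doSearch("troubles", 7, 8) (fullWord is true! node is 's');
no next character: return nextLength, which is 8; we have a match.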
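To make the two traces easy to reproduce, here is a condensed sketch. It is a simplification and not the code of the linked project: a single mutable SketchNode class (a hypothetical name) stands in for the TrieNodeBuilder/TrieNode pair, and children are looked up in a TreeMap instead of the sorted char[] with Arrays.binarySearch. The doSearch logic itself follows the method shown above.

```java
import java.nio.CharBuffer;
import java.util.Map;
import java.util.TreeMap;

// Simplified stand-in for the builder + immutable node pair shown above.
class SketchNode {
    private boolean fullWord = false;
    private final Map<Character, SketchNode> subnodes = new TreeMap<>();

    // Walks (and creates) one node per character, then marks the end of the word.
    void addWord(String word) {
        SketchNode node = this;
        for (char c : word.toCharArray()) {
            SketchNode next = node.subnodes.get(c);
            if (next == null) {
                next = new SketchNode();
                node.subnodes.put(c, next);
            }
            node = next;
        }
        node.fullWord = true;
    }

    // Returns the length of the longest stored word that prefixes needle, or -1.
    int search(String needle) {
        return doSearch(CharBuffer.wrap(needle), fullWord ? 0 : -1, 0);
    }

    private int doSearch(CharBuffer buffer, int matchedLength, int currentLength) {
        // If this node ends a word, the current depth becomes the last match.
        int nextLength = fullWord ? currentLength : matchedLength;
        if (!buffer.hasRemaining()) {
            return nextLength;
        }
        SketchNode child = subnodes.get(buffer.get());
        return child == null
            ? nextLength
            : child.doSearch(buffer, nextLength, currentLength + 1);
    }
}
```

Under these assumptions, after adding "trouble", "troubling" and "troubles" to a root node, search("tr") returns -1 and search("troubles") returns 8, in line with the traces above.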

Answer 2

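Well, you are thinking about it the right way: if you want to find the longest string without traversing the whole tree, you have to store some information while building the tree.
Suppose that for node i we store the maximum length of any path below it in max_depth[i], and we remember the child that leads to that maximum length as max_child[i]. Then, for each new word you insert into the trie, remember the last node you inserted (which is also a new leaf, representing the last character of the string) and do the following: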

current = last_inserted_leaf
while (current != root):
    if max_depth[parent[current]] < max_depth[current] + 1:
        max_depth[parent[current]] = max_depth[current] + 1
        max_child[parent[current]] = current
    current = parent[current]

Now, to output the longest string, just do the following:

current = root
while is_not_leaf(current):
    answer += char_of_child[max_child[current]]
    current = max_child[current]
return answer

So insertion takes 2*n = O(n) operations, and finding the longest string takes O(h), where h is the length of the longest string.
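The two pseudocode loops could be sketched in Java roughly as follows. The class and field names (DepthNode, DepthTrie, maxDepth, maxChild) are hypothetical, and a HashMap is assumed for the children, whereas the question stores them in a LinkedList. Unlike incrementing every ancestor, the upward pass only raises maxDepth where the new path is strictly deeper.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical node type: parent pointer plus the two bookkeeping fields.
class DepthNode {
    final char ch;                  // character leading into this node
    final DepthNode parent;         // null for the root
    final Map<Character, DepthNode> children = new HashMap<>();
    int maxDepth = 0;               // longest path below this node
    DepthNode maxChild = null;      // child on that longest path

    DepthNode(char ch, DepthNode parent) {
        this.ch = ch;
        this.parent = parent;
    }
}

class DepthTrie {
    private final DepthNode root = new DepthNode('\0', null);

    void insert(String word) {
        // Pass 1: walk down, creating missing nodes.
        DepthNode current = root;
        for (char c : word.toCharArray()) {
            DepthNode next = current.children.get(c);
            if (next == null) {
                next = new DepthNode(c, current);
                current.children.put(c, next);
            }
            current = next;
        }
        // Pass 2: walk back up, updating only where this path is deeper.
        while (current.parent != null) {
            DepthNode p = current.parent;
            if (p.maxDepth < current.maxDepth + 1) {
                p.maxDepth = current.maxDepth + 1;
                p.maxChild = current;
            }
            current = p;
        }
    }

    // Follows maxChild from the root; O(h) without any search.
    String longest() {
        StringBuilder answer = new StringBuilder();
        DepthNode current = root;
        while (current.maxChild != null) {
            current = current.maxChild;
            answer.append(current.ch);
        }
        return answer.toString();
    }
}
```

In this sketch ties are resolved in favour of the word inserted first, because the comparison is strict. For the question's failing case, inserting "a", "ac", "g" and "ae" leaves a.maxDepth at 1, so the walk stops after one child of 'a' as it should.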

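However, the algorithm above requires O(n) extra memory, which is too much. The simplest approach is to store max_string and, each time a string is added to the trie, compare the length of new_string with the length of max_string; if the new length is greater, assign max_string = new_string. This takes much less memory, and the longest string can be found in O(1).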
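A minimal sketch of this simpler idea, again with hypothetical names (TrackingTrie, maxString) and a HashMap assumed for the children:

```java
import java.util.HashMap;
import java.util.Map;

class TrackingTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean fullWord;
    }

    private final Node root = new Node();
    private String maxString = "";   // longest word inserted so far

    void insert(String word) {
        Node current = root;
        for (char c : word.toCharArray()) {
            Node next = current.children.get(c);
            if (next == null) {
                next = new Node();
                current.children.put(c, next);
            }
            current = next;
        }
        current.fullWord = true;
        // Single comparison per insert; no per-node bookkeeping needed.
        if (word.length() > maxString.length()) {
            maxString = word;
        }
    }

    // O(1): just return the remembered string.
    String longest() {
        return maxString;
    }
}
```

One caveat on the design: this only stays correct as long as words are never removed from the trie, since deleting the current longest word would leave maxString stale.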

How to find the longest string in a TRIE
