Getting a list of words from a Trie

Time: 2010-05-08 14:14:17

Tags: java trie

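I want to use the following code not to check whether a matching word exists in the Trie, but to return a list of all words that begin with the prefix entered by the user. Can someone point me in the right direction? I can't get it working at all.....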

public boolean search(String s)
{
    Node current = root;
    System.out.println("\nSearching for string: "+s);

    while(current != null)
    {
        for(int i=0;i<s.length();i++)
        {               
            if(current.child[(int)(s.charAt(i)-'a')] == null)
            {
                System.out.println("Cannot find string: "+s);
                return false;
            }
            else
            {
                current = current.child[(int)(s.charAt(i)-'a')];
                System.out.println("Found character: "+ current.content);
            }
        }
        // If we are here, the string exists.
        // But to ensure unwanted substrings are not found:

        if (current.marker == true)
        {
            System.out.println("Found string: "+s);
            return true;
        }
        else
        {
            System.out.println("Cannot find string: "+s +"(only present as a substring)");
            return false;
        }
    }

    return false; 
}

}

11 Answers:

Answer 0: (score: 7)

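I ran into this problem while trying to build a text auto-complete module. I solved it by building a Trie in which each node holds a reference to its parent as well as to its children. First, I search for the node at which the input prefix ends. Then I run a traversal over the Trie that explores every node of the subtree rooted at that prefix node. Whenever a node flagged as a leaf (isLeaf) is encountered, the end of a word starting with the input prefix has been found. From that node I walk up through the parents, parent of parent and so on, until I reach the root of the subtree, pushing each node's key onto a stack along the way. Finally I take the prefix and append to it by popping the stack, and I save the resulting word in an ArrayList. When the traversal ends, I have all the words that start with the input prefix. Here is the code with a usage example: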

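import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.Stack;

class TrieNode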
{
    char c;
    TrieNode parent;
    HashMap<Character, TrieNode> children = new HashMap<Character, TrieNode>();
    boolean isLeaf;

    public TrieNode() {}
    public TrieNode(char c){this.c = c;}
}


public class Trie
{
    private TrieNode root;
    ArrayList<String> words; 
    TrieNode prefixRoot;
    String curPrefix;

    public Trie()
    {
        root = new TrieNode();
        words  = new ArrayList<String>();
    }

    // Inserts a word into the trie.
    public void insert(String word) 
    {
        HashMap<Character, TrieNode> children = root.children;

        TrieNode crntparent;

        crntparent = root;

        //cur children parent = root

        for(int i=0; i<word.length(); i++)
        {
            char c = word.charAt(i);

            TrieNode t;
            if(children.containsKey(c)){ t = children.get(c);}
            else
            {
            t = new TrieNode(c);
            t.parent = crntparent;
            children.put(c, t);
            }

            children = t.children;
            crntparent = t;

            //set leaf node
            if(i==word.length()-1)
                t.isLeaf = true;    
        }
    }

    // Returns if the word is in the trie.
    public boolean search(String word)
    {
        TrieNode t = searchNode(word);
        if(t != null && t.isLeaf){return true;}
        else{return false;}
    }

    // Returns if there is any word in the trie
    // that starts with the given prefix.
    public boolean startsWith(String prefix) 
    {
        if(searchNode(prefix) == null) {return false;}
        else{return true;}
    }

    public TrieNode searchNode(String str)
    {
        Map<Character, TrieNode> children = root.children; 
        TrieNode t = null;
        for(int i=0; i<str.length(); i++)
        {
            char c = str.charAt(i);
            if(children.containsKey(c))
            {
                t = children.get(c);
                children = t.children;
            }
            else{return null;}
        }

        prefixRoot = t;
        curPrefix = str;
        words.clear();
        return t;
    }


    ///////////////////////////


  void wordsFinderTraversal(TrieNode node, int offset) 
  {
        //  print(node, offset);

        if(node.isLeaf==true)
        {
          //println("leaf node found");

          TrieNode altair;
          altair = node;

          Stack<String> hstack = new Stack<String>(); 

          while(altair != prefixRoot)
          {
            //println(altair.c);
            hstack.push( Character.toString(altair.c) );
            altair = altair.parent;
          }

          String wrd = curPrefix;

          while(hstack.empty()==false)
          {
            wrd = wrd + hstack.pop();
          }

          //println(wrd);
          words.add(wrd);

        }

        // Copy the child keys into a list so that the visiting order can be changed.
        ArrayList<Character> aloc = new ArrayList<Character>(node.children.keySet());

        // here you can play with the order of the children

        for (int i = 0; i < aloc.size(); i++)
        {
            wordsFinderTraversal(node.children.get(aloc.get(i)), offset + 2);
        }

  }


    void displayFoundWords()
    {
        System.out.println("_______________");
        for(int i=0;i<words.size();i++)
        {
            System.out.println(words.get(i));
        }
        System.out.println("________________");
    }



}//

Example

Trie prefixTree;

prefixTree = new Trie();  

  prefixTree.insert("GOING");
  prefixTree.insert("GONG");
  prefixTree.insert("PAKISTAN");
  prefixTree.insert("SHANGHAI");
  prefixTree.insert("GONDAL");
  prefixTree.insert("GODAY");
  prefixTree.insert("GODZILLA");

  if( prefixTree.startsWith("GO")==true)
  {
    TrieNode tn = prefixTree.searchNode("GO");
    prefixTree.wordsFinderTraversal(tn,0);
    prefixTree.displayFoundWords(); 

  }

  if( prefixTree.startsWith("GOD")==true)
  {
    TrieNode tn = prefixTree.searchNode("GOD");
    prefixTree.wordsFinderTraversal(tn,0);
    prefixTree.displayFoundWords(); 

  }

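Answer 1: (score: 5)

The simplest solution is to use a depth-first search.

You go down the trie, matching letter by letter from the input. Then, once you have no more letters to match, everything under that node is a string you want. Recursively explore that whole subtrie, building the string as you go down to its nodes.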
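
The description above could be sketched roughly as follows. This is only an illustration under assumptions: the names PrefixDfs, Node, insert and wordsWithPrefix are hypothetical and do not come from the question, and a TreeMap is used for the children purely so that results come out in alphabetical order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names appear in the question's code.
class PrefixDfs {
    static class Node {
        // TreeMap keeps children sorted, so results come out alphabetically.
        Map<Character, Node> children = new TreeMap<>();
        boolean isWord; // true when a word ends at this node
    }

    final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.children.computeIfAbsent(c, k -> new Node());
        }
        cur.isWord = true;
    }

    List<String> wordsWithPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        Node cur = root;
        // Walk down, matching the prefix letter by letter.
        for (char c : prefix.toCharArray()) {
            cur = cur.children.get(c);
            if (cur == null) {
                return out; // no word starts with this prefix
            }
        }
        // Everything at or below this node is a match.
        collect(cur, new StringBuilder(prefix), out);
        return out;
    }

    // Depth-first search that builds the string while descending.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.isWord) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());               // go down one letter
            collect(e.getValue(), path, out);
            path.deleteCharAt(path.length() - 1);  // backtrack
        }
    }
}
```

Reusing one StringBuilder and undoing the last append after each recursive call avoids creating a new string per node; a plain `prefix + c` in the recursive call would work just as well for small tries.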

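Answer 2: (score: 1)

In my opinion, this is easier to solve recursively. It would go something like this:

  1. Write a recursive function Print that prints all the nodes in the trie rooted at the node you pass in as a parameter. Wiki tells you how to do this (look at sorting).
  2. Walk down from the root of your trie to find the node labelled with the last character of your prefix. Call the Print function with this node as the parameter. Then make sure you also output the prefix before each word, since Print alone gives you all the words without their prefix.
  3. If you don't really care about efficiency, you can just run Print on the main root node and only print those words that start with the prefix you're interested in. This is easier to implement, but slower.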
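
A rough sketch of the three steps, assuming a node layout like the question's (a 26-element child array and a marker flag, so lowercase a-z only). The names RecursivePrint, insert, print, fast and slow are hypothetical, and the words are collected into a list rather than printed so that the two variants can be compared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; Node mirrors the question's child[]/marker layout.
// Assumption: every word and prefix consists of lowercase a-z only.
class RecursivePrint {
    static class Node {
        Node[] child = new Node[26];
        boolean marker; // true when a word ends at this node
    }

    static void insert(Node root, String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (cur.child[i] == null) {
                cur.child[i] = new Node();
            }
            cur = cur.child[i];
        }
        cur.marker = true;
    }

    // Step 1: collect every word stored at or below `node`, spelled relative to `node`.
    static void print(Node node, String soFar, List<String> out) {
        if (node.marker) {
            out.add(soFar);
        }
        for (int i = 0; i < node.child.length; i++) {
            if (node.child[i] != null) {
                print(node.child[i], soFar + (char) ('a' + i), out);
            }
        }
    }

    // Step 2: find the prefix node, then put the prefix back in front of each result.
    static List<String> fast(Node root, String prefix) {
        Node cur = root;
        for (char c : prefix.toCharArray()) {
            cur = cur.child[c - 'a'];
            if (cur == null) {
                return new ArrayList<>();
            }
        }
        List<String> suffixes = new ArrayList<>();
        print(cur, "", suffixes);
        List<String> words = new ArrayList<>();
        for (String s : suffixes) {
            words.add(prefix + s);
        }
        return words;
    }

    // Step 3: the simpler but slower variant, which visits the whole trie and filters.
    static List<String> slow(Node root, String prefix) {
        List<String> all = new ArrayList<>();
        print(root, "", all);
        List<String> words = new ArrayList<>();
        for (String w : all) {
            if (w.startsWith(prefix)) {
                words.add(w);
            }
        }
        return words;
    }
}
```

Both variants return the same words; fast only visits the subtree under the prefix, while slow always visits every node in the trie.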

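Answer 3: (score: 1)

You need to traverse the subtree starting at the node you found for the prefix.

Start in the same way, i.e. find the correct node. Then, instead of checking its marker, traverse that subtree (i.e. go over all of its descendants; a DFS is a good way to do this), keeping track of the substring used to get from the first node to the "current" node.

If the current node is marked as a word, output* the prefix + the substring reached.

* or add it to a list or something.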
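
One possible way to carry the substring along is an explicit stack instead of recursion. The sketch below is only an illustration: SubtreeWalk, Node, Frame, insert and wordsWithPrefix are hypothetical names, and the children are assumed to live in a HashMap, so the order of the results is not defined.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: SubtreeWalk, Node and Frame are illustrative names.
class SubtreeWalk {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
        boolean marker; // true when a word ends at this node
    }

    // A node paired with the substring that leads to it from the prefix node.
    static class Frame {
        final Node node;
        final String substring;

        Frame(Node node, String substring) {
            this.node = node;
            this.substring = substring;
        }
    }

    static void insert(Node root, String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.children.computeIfAbsent(c, k -> new Node());
        }
        cur.marker = true;
    }

    static List<String> wordsWithPrefix(Node root, String prefix) {
        List<String> found = new ArrayList<>();
        // Find the node for the prefix, exactly as a normal search would.
        Node start = root;
        for (char c : prefix.toCharArray()) {
            start = start.children.get(c);
            if (start == null) {
                return found;
            }
        }
        // Iterative DFS over all descendants of the prefix node.
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame(start, ""));
        while (!stack.isEmpty()) {
            Frame f = stack.pop();
            if (f.node.marker) {
                found.add(prefix + f.substring); // prefix + substring reached
            }
            for (Map.Entry<Character, Node> e : f.node.children.entrySet()) {
                stack.push(new Frame(e.getValue(), f.substring + e.getKey()));
            }
        }
        return found;
    }
}
```

Note that the prefix node itself is checked too, so a prefix that is also a complete word shows up in the result.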

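Answer 4: (score: 1)

I once built a trie for one of the ITA puzzles.

import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

public class WordTree {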


class Node {

    private final char ch;

    /**
     * Flag indicates that this node is the end of the string.
     */
    private boolean end;

    private LinkedList<Node> children;

    public Node(char ch) {
        this.ch = ch;
    }

    public void addChild(Node node) {
        if (children == null) {
            children = new LinkedList<Node>();
        }
        children.add(node);
    }

    public Node getNode(char ch) {
        if (children == null) {
            return null;
        }
        for (Node child : children) {
            if (child.getChar() == ch) {
                return child;
            }
        }
        return null;
    }

    public char getChar() {
        return ch;
    }

    public List<Node> getChildren() {
        if (this.children == null) {
            return Collections.emptyList();
        }
        return children;
    }

    public boolean isEnd() {
        return end;
    }

    public void setEnd(boolean end) {
        this.end = end;
    }
}


Node root = new Node(' ');

public WordTree() {
}

/**
 * Searches for strings that match the prefix.
 *
 * @param prefix - prefix
 * @return - list of strings that match the prefix, or an empty list if no matches are found.
 */
public List<String> getWordsForPrefix(String prefix) {
    if (prefix.length() == 0) {
        return Collections.emptyList();
    }
    Node node = getNodeForPrefix(root, prefix);
    if (node == null) {
        return Collections.emptyList();
    }
    List<LinkedList<Character>> chars = collectChars(node);
    List<String> words = new ArrayList<String>(chars.size());
    for (LinkedList<Character> charList : chars) {
        words.add(combine(prefix.substring(0, prefix.length() - 1), charList));
    }
    return words;
}


private String combine(String prefix, List<Character> charList) {
    StringBuilder sb = new StringBuilder(prefix);
    for (Character character : charList) {
        sb.append(character);
    }
    return sb.toString();
}


private Node getNodeForPrefix(Node node, String prefix) {
    if (prefix.length() == 0) {
        return node;
    }
    Node next = node.getNode(prefix.charAt(0));
    if (next == null) {
        return null;
    }
    return getNodeForPrefix(next, prefix.substring(1, prefix.length()));
}


private List<LinkedList<Character>> collectChars(Node node) {
    List<LinkedList<Character>> chars = new ArrayList<LinkedList<Character>>();

    if (node.getChildren().size() == 0) {
        chars.add(new LinkedList<Character>(Collections.singletonList(node.getChar())));
    } else {
        if (node.isEnd()) {
            chars.add(new LinkedList<Character>(
                    Collections.singletonList(node.getChar())));
        }
        List<Node> children = node.getChildren();
        for (Node child : children) {
            List<LinkedList<Character>> childList = collectChars(child);
            for (LinkedList<Character> characters : childList) {
                characters.push(node.getChar());
                chars.add(characters);
            }
        }
    }
    return chars;
}


public void addWord(String word) {
    addWord(root, word);
}

private void addWord(Node parent, String word) {
    if (word.trim().length() == 0) {
        return;
    }
    Node child = parent.getNode(word.charAt(0));
    if (child == null) {
        child = new Node(word.charAt(0));
        parent.addChild(child);
    }
    if (word.length() == 1) {
        child.setEnd(true);
    } else {
        addWord(child, word.substring(1, word.length()));
    }
}


public static void main(String[] args) {
    WordTree tree = new WordTree();
    tree.addWord("world");
    tree.addWord("work");
    tree.addWord("wolf");
    tree.addWord("life");
    tree.addWord("love");
    System.out.println(tree.getWordsForPrefix("wo"));
}

}

Answer 5: (score: 1)

After building the Trie, you can do a DFS starting from the node where the prefix ends:

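Here node is a Trie node (assumed to be a dict mapping each character to its child node, with a hypothetical END_OF_WORD key marking the end of a word), word is the word built so far, and res is the list of words found.

def dfs(self, node, word, res):
    # When a word ends at this node, add the current word to the list.
    # Do not return here: longer words may continue below this node.
    if END_OF_WORD in node:
        res.append(word)
    # For each child, go deeper in DFS fashion,
    # appending the child's character to the current word.
    for ch, child in node.items():
        if ch != END_OF_WORD:
            self.dfs(child, word + ch, res)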

Answer 6: (score: 0)

You need to use a List:
List<String> myList = new ArrayList<String>();
if(matchingStringFound)
    myList.add(stringToAdd);

Answer 7: (score: 0)

After your for loop, add a call to printAllStringsInTrie(current, s);

void printAllStringsInTrie(Node t, String prefix) {
  if (t.marker) System.out.println(prefix);
  for (int i = 0; i < t.child.length; i++) {
    if (t.child[i] != null) {
      // Cast back to char: 'a' + i is an int, and String + int would append a number.
      printAllStringsInTrie(t.child[i], prefix + (char) ('a' + i));
    }
  }
}

Answer 8: (score: 0)

Here is an implementation in C++:

https://github.com/dchavezlive/Basic-Trie

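In the search function, you can have it return the node where the prefix ends. If you make sure your node has a field that holds each of its children (a vector?), then you can list all the children starting from the node where the prefix ends.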

Answer 9: (score: 0)

A simple recursive DFS algorithm can be used to find all the words for a given prefix.

Sample trie node:

static class TrieNode {
    Map<Character, TrieNode> children = new HashMap<>();
    boolean isWord = false;
}

Method to find all the words for a given prefix:

static List<String> findAllWordsForPrefix(String prefix, TrieNode root) {
    List<String> words = new ArrayList<>();
    TrieNode current = root;
    for(Character c: prefix.toCharArray()) {
        TrieNode nextNode = current.children.get(c);
        if(nextNode == null) return words;
        current = nextNode;
    }
    if(!current.children.isEmpty()) {
        findAllWordsForPrefixRecursively(prefix, current, words);
    } else {
        if(current.isWord) words.add(prefix);
    }
    return words;
}

static void findAllWordsForPrefixRecursively(String prefix, TrieNode node, List<String> words) {
    if(node.isWord) words.add(prefix);
    if(node.children.isEmpty()) {
        return;
    }
    for(Character c: node.children.keySet()) {
        findAllWordsForPrefixRecursively(prefix + c, node.children.get(c), words);
    }
}

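The complete code can be found here: TrieDataStructure Example

Answer 10: (score: -1)

The recursive code below can be used when your TrieNode looks like the following. This code works fine.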

TrieNode(char c)
{

        this.con=c;
        this.isEnd=false;
        list=new ArrayList<TrieNode>();
        count=0;

}

//--------------------------------------------------

public void Print(TrieNode root1, ArrayList<Character> path)
{

      if(root1==null)
          return;

      if(root1.isEnd==true)
      {
          //print the entire path
          ListIterator<Character> itr1=path.listIterator();
          while(itr1.hasNext())
          {
              System.out.print(itr1.next());
          }
          System.out.println();
          // no return here: longer words may continue below this node
      }

      ListIterator<TrieNode> itr=root1.list.listIterator();
      while(itr.hasNext())
      {
          TrieNode child=itr.next();
          path.add(child.con);
          Print(child,path);
          path.remove(path.size()-1);
      }
}