Is there a way to parse the PTB tree below to get all of its subtrees? For example:
Text : Today is a nice day.
PTB : (3 (2 Today) (3 (3 (2 is) (3 (2 a) (3 (3 nice) (2 day)))) (2 .)))
I need all of the subtrees.
Output :
(3 (2 Today) (3 (3 (2 is) (3 (2 a) (3 (3 nice) (2 day)))) (2 .)))
(2 Today)
(3 (3 (2 is) (3 (2 a) (3 (3 nice) (2 day)))) (2 .))
(3 (2 is) (3 (2 a) (3 (3 nice) (2 day))))
(2 is)
(3 (2 a) (3 (3 nice) (2 day)))
(2 a)
(3 (3 nice) (2 day))
(3 nice)
(2 day)
(2 .)
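Answer 0 (score: 1)
The input file for this demo should contain one string representation of a tree per line. This example prints the subtrees of the first tree in the file.
The Stanford CoreNLP class of interest is Tree.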
import edu.stanford.nlp.trees.*;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class TreeLoadExample {

    // Prints t and every subtree below it in pre-order.
    // Leaves (the bare words) are skipped, so the smallest units printed look like "(2 Today)".
    public static void printSubTrees(Tree t) {
        if (t.isLeaf()) {
            return;
        }
        System.out.println(t);
        for (Tree subTree : t.children()) {
            printSubTrees(subTree);
        }
    }

    public static void main(String[] args) throws IOException {
        TreeFactory tf = new LabeledScoredTreeFactory();
        // args[0] is the path to a UTF-8 file with one bracketed tree per line.
        try (Reader r = new BufferedReader(
                new InputStreamReader(new FileInputStream(args[0]), StandardCharsets.UTF_8))) {
            TreeReader tr = new PennTreeReader(r, tf);
            Tree t = tr.readTree(); // reads only the first tree in the file
            printSubTrees(t);
        }
    }
}
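If adding CoreNLP as a dependency is not an option, the same list can be produced from the string alone by matching brackets. The following is only a minimal sketch, not part of CoreNLP: the class name BracketSubtrees and the method subtrees are made up for illustration, and it assumes the input is well formed and that tokens never contain literal parentheses (PTB data normally escapes them as -LRB- and -RRB-).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, independent of CoreNLP.
public class BracketSubtrees {

    // Returns every bracketed constituent of a PTB-style string in pre-order,
    // i.e. ordered by the position of its opening parenthesis.
    public static List<String> subtrees(String ptb) {
        List<String> result = new ArrayList<>();
        // Each entry is {slot in result, offset of the '(' in ptb}.
        Deque<int[]> open = new ArrayDeque<>();
        for (int i = 0; i < ptb.length(); i++) {
            char c = ptb.charAt(i);
            if (c == '(') {
                open.push(new int[] {result.size(), i});
                result.add(null); // reserve the slot so the output stays in pre-order
            } else if (c == ')') {
                if (open.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced ')' at offset " + i);
                }
                int[] o = open.pop();
                result.set(o[0], ptb.substring(o[1], i + 1));
            }
        }
        if (!open.isEmpty()) {
            throw new IllegalArgumentException("Unclosed '(' in input");
        }
        return result;
    }

    public static void main(String[] args) {
        String ptb = "(3 (2 Today) (3 (3 (2 is) (3 (2 a) (3 (3 nice) (2 day)))) (2 .)))";
        for (String s : subtrees(ptb)) {
            System.out.println(s);
        }
    }
}
```

For the example sentence this yields 11 subtrees, starting with the whole tree and ending with (2 .). Unlike the Tree-based version, it works on raw text only, so it gives no access to labels or children as objects.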