I want to print the parse tree and the universal dependencies for a given line of text, as shown in the demo at http://nlp.stanford.edu:8080/parser/index.jsp
Here is my code:
public class ParseDoc {

    private final static String PCG_MODEL = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";

    private final TokenizerFactory<CoreLabel> tokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(), "invertible=true");

    private static final LexicalizedParser parser = LexicalizedParser.loadModel(PCG_MODEL);

    public Tree parse(String str) {
        List<CoreLabel> tokens = tokenize(str);
        Tree tree = parser.apply(tokens);
        return tree;
    }

    private List<CoreLabel> tokenize(String str) {
        Tokenizer<CoreLabel> tokenizer =
            tokenizerFactory.getTokenizer(
                new StringReader(str));
        return tokenizer.tokenize();
    }

    public static void main(String[] args) {
        String str = "My dog also likes eating sausage.";
        // Parser parser = new Parser();
        Tree tree = parser.parse(str);

        List<Tree> leaves = tree.getLeaves();
        // Print words and Pos Tags
        for (Tree leaf : leaves) {
            Tree parent = leaf.parent(tree);
            System.out.print(leaf.label().value() + "-" + parent.label().value() + " ");
        }
        System.out.println();
    }
}
Unfortunately, all I get is the words with their POS tags:
My-PRP$ dog-NN also-RB likes-VBZ eating-VBG sausage-NN .-.
which is of no use to me.
I want to print the tree:
(ROOT
  (S
    (NP (PRP$ My) (NN dog))
    (ADVP (RB also))
    (VP (VBZ likes)
      (S
        (VP (VBG eating)
          (NP (NN sausage)))))
    (. .)))
and the universal dependencies:
nmod:poss(dog-2, My-1)
nsubj(likes-4, dog-2)
advmod(likes-4, also-3)
root(ROOT-0, likes-4)
xcomp(likes-4, eating-5)
dobj(eating-5, sausage-6)
How can I do this?
Answer 0 (score: 2)
Here is some sample code:
package edu.stanford.nlp.examples;

import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.semgraph.*;
import edu.stanford.nlp.util.*;
import java.util.*;

public class PrintParse {

    public static void main(String[] args) {
        Annotation document =
            new Annotation("My dog also likes eating sausage.");
        // The parse annotator produces both the constituency tree and the dependency graphs.
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        pipeline.annotate(document);
        for (CoreMap sentence : document.get(CoreAnnotations.SentencesAnnotation.class)) {
            // Constituency (phrase-structure) tree for this sentence
            Tree constituencyParse = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
            System.out.println(constituencyParse);
            // Basic dependencies, printed one relation per line
            SemanticGraph dependencyParse =
                sentence.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class);
            System.out.println(dependencyParse.toList());
        }
    }
}
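Note that System.out.println(constituencyParse) prints the tree on a single line, whereas the question asks for the indented layout. CoreNLP's Tree class has a pennPrint() method for that, so calling constituencyParse.pennPrint() is probably the simplest fix. If you only have the one-line bracketed string, the sketch below shows one way to re-indent it in plain Java with no CoreNLP dependency. The class BracketTreePrinter and its format method are hypothetical names made up for this illustration, not part of CoreNLP, and the layout is a simplified approximation of the demo's output (for example, it puts every child of a mixed constituent on its own line). It assumes well-formed input in which every bracket carries a label.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CoreNLP: re-indents a one-line bracketed tree.
public class BracketTreePrinter {

    /** A labelled constituent with children, or a leaf word (no children). */
    private static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        boolean isLeaf() { return children.isEmpty(); }

        boolean isPreterminal() {
            return children.size() == 1 && children.get(0).isLeaf();
        }
    }

    private final String[] tokens;
    private int pos = 0;

    private BracketTreePrinter(String bracketed) {
        // Pad brackets with spaces so a whitespace split yields "(", ")", labels and words.
        tokens = bracketed.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
    }

    /** Recursive descent over the tokens; assumes well-formed, labelled brackets. */
    private Node parseNode() {
        if (!tokens[pos].equals("(")) {
            return new Node(tokens[pos++]); // a leaf word
        }
        pos++; // consume "("
        Node node = new Node(tokens[pos++]);
        while (!tokens[pos].equals(")")) {
            node.children.add(parseNode());
        }
        pos++; // consume ")"
        return node;
    }

    private static void render(Node node, int depth, StringBuilder out) {
        if (node.isLeaf()) {
            out.append(node.label);
            return;
        }
        out.append('(').append(node.label);
        // Keep a constituent on one line when all its children are words or POS-tagged words.
        boolean flat = true;
        for (Node child : node.children) {
            flat &= child.isLeaf() || child.isPreterminal();
        }
        for (Node child : node.children) {
            if (flat) {
                out.append(' ');
            } else {
                out.append('\n');
                for (int i = 0; i <= depth; i++) {
                    out.append("  "); // two spaces per nesting level
                }
            }
            render(child, depth + 1, out);
        }
        out.append(')');
    }

    /** Returns the bracketed tree re-indented, one nested constituent per line. */
    public static String format(String bracketed) {
        BracketTreePrinter printer = new BracketTreePrinter(bracketed);
        StringBuilder out = new StringBuilder();
        render(printer.parseNode(), 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        String oneLine = "(ROOT (S (NP (PRP$ My) (NN dog)) (ADVP (RB also)) "
            + "(VP (VBZ likes) (S (VP (VBG eating) (NP (NN sausage))))) (. .)))";
        System.out.println(format(oneLine));
    }
}
```

In the answer's loop this could be called as BracketTreePrinter.format(constituencyParse.toString()), although pennPrint() remains the more direct route when the Tree object is available.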