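I want to extract what I call "linked dependencies".
I was able to create a dependency tree with the Stanford Parser and Java. My sentence is: "Also, important e-mails have to be moved and marked by Thomas."
The result looks like this: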
Dependency Graph: root(ROOT-0, have-5)
advmod(have-5, Also-1)
punct(have-5, ,-2)
amod(e-mails-4, important-3)
nsubj(have-5, e-mails-4)
nsubjpass(moved-8, e-mails-4)
mark(moved-8, to-6)
auxpass(moved-8, be-7)
xcomp(have-5, moved-8)
cc(moved-8, and-9)
xcomp(have-5, marked-10)
conj:and(moved-8, marked-10)
case(Thomas-12, by-11)
nmod:by(marked-10, Thomas-12)
punct(have-5, .-13)
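As you can see, the words carry the indices 1-12 (the final period is 13). With a few loops and if statements I was able to extract, for example, "Thomas marked e-mails".
Code: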
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations;
import edu.stanford.nlp.util.CoreMap;
import java.util.Properties;

// Class wrapper added so the snippet compiles; the class name is arbitrary.
public class DependencyExample {
    public static void main(String[] args) {
        // Configure the pipeline
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,parse,depparse");
        props.setProperty("tokenize.options", "ptb3Escaping=false");
        props.setProperty("parse.maxlen", "10000");
        props.setProperty("depparse.extradependencies", "SUBJ_ONLY");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // Annotate the example sentence
        String str = "Also, important e-mails have to be moved and marked by Thomas.";
        Annotation document = new Annotation(str);
        pipeline.annotate(document);

        // Print the collapsed, CC-processed dependencies of the first sentence
        CoreMap sentence = document.get(CoreAnnotations.SentencesAnnotation.class).get(0);
        SemanticGraph dependency_graph = sentence.get(SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);
        System.err.println("\n\nDependency Graph: " + dependency_graph.toString(SemanticGraph.OutputFormat.LIST));
    }
}
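But I also want to extract "Thomas moved e-mails". In the dependency tree this is easy to see from the "conj:and" relation. Is there an efficient way to do this kind of extraction? I keep stacking loop upon loop, and that cannot be the solution.
Here is my current loop/if disaster: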
// tdl is presumably the sentence's List<TypedDependency>; its declaration is not shown.
for (int i = 0; i < tdl.size(); i++) {
    // Start from the subject
    if (tdl.get(i).reln().toString().contains("nsubj")) {
        String subjekt = "";
        // Catch the case where the subject is a person (pronoun)
        if (tdl.get(i).dep().toString().contains("PRP")) {
            subjekt = "PERSON";
        } else {
            subjekt = tdl.get(i).dep().word();
        }
        String praedikat = tdl.get(i).gov().word();
        System.out.println(subjekt + praedikat);
        for (int j = 0; j < tdl.size(); j++) {
            // Strings are compared with equals(); == only compares references
            if (tdl.get(j).reln().toString().equals("dobj") && (tdl.get(i).gov().index() == tdl.get(j).gov().index())) {
                // Look for the object of the same verb
                System.out.println(praedikat + "(" + subjekt + "," + tdl.get(j).dep().word() + ")");
            } else if (tdl.get(j).reln().toString().contains("conj") && (tdl.get(i).gov().index() == tdl.get(j).dep().index())) {
                // The verb is a conjunct: look for the object of the verb it is coordinated with
                for (int k = 0; k < tdl.size(); k++) {
                    if (tdl.get(k).reln().toString().equals("dobj") && (tdl.get(k).gov().index() == tdl.get(j).gov().index())) {
                        System.out.println(praedikat + "(" + subjekt + "," + tdl.get(k).dep().word() + ")");
                    }
                }
            } else if (tdl.get(j).reln().toString().contains("nmod:agent") && (tdl.get(j).gov().index() == tdl.get(i).gov().index())) {
                // Passive agent attached to the same verb
                System.out.println(praedikat + "(" + subjekt + "," + tdl.get(j).dep().word() + ")");
            }
        }
    }
}
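To make clearer what I mean by following the "conj:and" link, here is a minimal sketch of the idea, not a finished solution. It works on the printed "rel(gov-i, dep-j)" lines instead of the CoreNLP objects, and the names LinkedDeps, Edge and extract are made up for illustration. It rests on the assumption that coordinated verbs share their passive subject (nsubjpass) and their agent (nmod:by or nmod:agent), which is a heuristic and will not hold for every sentence.

```java
import java.util.*;
import java.util.regex.*;

// Hypothetical helper, not part of CoreNLP: parses printed "rel(gov-i, dep-j)" lines.
class LinkedDeps {
    // One dependency edge; gov and dep keep their "-index" suffix so repeated words stay distinct.
    static final class Edge {
        final String rel, gov, dep;
        Edge(String rel, String gov, String dep) { this.rel = rel; this.gov = gov; this.dep = dep; }
    }

    private static final Pattern LINE = Pattern.compile("([\\w:]+)\\((\\S+), (\\S+)\\)");

    // Follow conj links upwards to the first verb of a coordination (guarded against cycles).
    static String root(Map<String, String> head, String node) {
        Set<String> seen = new HashSet<>();
        while (head.containsKey(node) && seen.add(node)) node = head.get(node);
        return node;
    }

    static void add(Map<String, Set<String>> m, String key, String value) {
        m.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(value);
    }

    // "e-mails-4" -> "e-mails"
    static String word(String node) { return node.substring(0, node.lastIndexOf('-')); }

    static List<String> extract(List<String> lines) {
        List<Edge> edges = new ArrayList<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches()) edges.add(new Edge(m.group(1), m.group(2), m.group(3)));
        }
        // Pass 1: link every conjunct to the word it is coordinated with.
        Map<String, String> head = new HashMap<>();
        for (Edge e : edges) if (e.rel.startsWith("conj")) head.put(e.dep, e.gov);

        // Pass 2: collect verbs, agents and passive subjects per coordination group.
        Map<String, Set<String>> verbs = new LinkedHashMap<>();
        Map<String, Set<String>> agents = new HashMap<>();
        Map<String, Set<String>> patients = new HashMap<>();
        for (Edge e : edges) {
            String group = root(head, e.gov);
            if (e.rel.equals("nsubjpass")) {
                add(patients, group, e.dep);
                add(verbs, group, e.gov);
            } else if (e.rel.equals("nmod:by") || e.rel.equals("nmod:agent")) {
                add(agents, group, e.dep);
                add(verbs, group, e.gov);
            } else if (e.rel.startsWith("conj")) {
                add(verbs, group, e.gov);
                add(verbs, group, e.dep);
            }
        }
        // Emit "agent verb patient" for every verb of a group that has both roles.
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Set<String>> g : verbs.entrySet())
            for (String verb : g.getValue())
                for (String agent : agents.getOrDefault(g.getKey(), Collections.emptySet()))
                    for (String patient : patients.getOrDefault(g.getKey(), Collections.emptySet()))
                        out.add(word(agent) + " " + word(verb) + " " + word(patient));
        return out;
    }

    public static void main(String[] args) {
        List<String> parse = Arrays.asList(
            "nsubjpass(moved-8, e-mails-4)",
            "conj:and(moved-8, marked-10)",
            "nmod:by(marked-10, Thomas-12)");
        for (String s : extract(parse)) System.out.println(s);
        // prints "Thomas moved e-mails" and "Thomas marked e-mails"
    }
}
```

Grouping the edges in maps first means each relation is looked up by key, so a new relation type costs one more branch rather than one more nested pass over the whole list. The same grouping could presumably be done directly on the SemanticGraph edges; the string parsing is only there to keep the sketch independent of the library.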