Extracting pairs from a dependency tree (Stanford Parser)

Time: 2015-09-22 12:54:47

Tags: java parsing tree dependencies

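I want to extract what I call "linked dependencies".

I am able to create a dependency tree using the Stanford Parser and Java. My sentence is: "Also, important e-mails have to be moved and marked by Thomas."

The result looks like this: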

Dependency Graph: root(ROOT-0, have-5)
advmod(have-5, Also-1)
punct(have-5, ,-2)
amod(e-mails-4, important-3)
nsubj(have-5, e-mails-4)
nsubjpass(moved-8, e-mails-4)
mark(moved-8, to-6)
auxpass(moved-8, be-7)
xcomp(have-5, moved-8)
cc(moved-8, and-9)
xcomp(have-5, marked-10)
conj:and(moved-8, marked-10)
case(Thomas-12, by-11)
nmod:by(marked-10, Thomas-12)
punct(have-5, .-13)

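As you can see, every token carries an index (1-12 for the sentence itself, 13 for the final period). With a few loops and if-statements I am able to extract, for example: "Thomas marked e-mails".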
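To make the `relation(governor-index, dependent-index)` shape of the listing concrete, here is a minimal sketch that reads such lines back into small records. It works on the printed text only and does not use CoreNLP; the `DepListParser` and `Edge` names and the regular expression are assumptions made for this illustration. Inside a CoreNLP program the same information would normally be read from the graph object rather than from its printout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DepListParser {

    // Hypothetical record for one line of the listing; not a CoreNLP class.
    static final class Edge {
        final String rel;
        final String govWord;
        final int govIdx;
        final String depWord;
        final int depIdx;

        Edge(String rel, String govWord, int govIdx, String depWord, int depIdx) {
            this.rel = rel;
            this.govWord = govWord;
            this.govIdx = govIdx;
            this.depWord = depWord;
            this.depIdx = depIdx;
        }

        @Override
        public String toString() {
            return rel + "(" + govWord + "-" + govIdx + ", " + depWord + "-" + depIdx + ")";
        }
    }

    // rel(word-index, word-index); words may themselves contain '-' or ','.
    private static final Pattern LINE =
            Pattern.compile("([\\w:]+)\\((.+)-(\\d+), (.+)-(\\d+)\\)");

    // Parses every line that has the expected shape and silently skips the rest.
    static List<Edge> parse(String listing) {
        List<Edge> edges = new ArrayList<>();
        for (String line : listing.split("\\R")) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches()) {
                edges.add(new Edge(m.group(1),
                        m.group(2), Integer.parseInt(m.group(3)),
                        m.group(4), Integer.parseInt(m.group(5))));
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        String listing = "nsubjpass(moved-8, e-mails-4)\n"
                + "conj:and(moved-8, marked-10)\n"
                + "nmod:by(marked-10, Thomas-12)\n"
                + "punct(have-5, ,-2)";
        for (Edge e : parse(listing)) {
            System.out.println(e.rel + ": " + e.govWord + " -> " + e.depWord);
        }
    }
}
```

The index is kept next to the word because the word alone is not a reliable key: the same word form can occur more than once in a sentence.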

Code:

    import edu.stanford.nlp.ling.CoreAnnotations;
    import edu.stanford.nlp.pipeline.Annotation;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;
    import edu.stanford.nlp.semgraph.SemanticGraph;
    import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations;
    import edu.stanford.nlp.util.CoreMap;
    import java.util.Properties;

    // Wrapper class added so the snippet compiles; the name is arbitrary.
    public class DependencyDemo {

        public static void main(String[] args) {
            // Configure the pipeline
            Properties props = new Properties();
            props.setProperty("annotators", "tokenize,ssplit,pos,parse,depparse");
            props.setProperty("tokenize.options", "ptb3Escaping=false");
            props.setProperty("parse.maxlen", "10000");
            props.setProperty("depparse.extradependencies", "SUBJ_ONLY");
            StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

            // Annotate the example sentence
            String str = "Also, important e-mails have to be moved and marked by Thomas.";
            Annotation document = new Annotation(str);
            pipeline.annotate(document);

            // Take the first sentence and print its collapsed, CC-processed dependencies
            CoreMap sentence = document.get(CoreAnnotations.SentencesAnnotation.class).get(0);
            SemanticGraph dependency_graph = sentence.get(
                    SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);
            System.err.println("\n\nDependency Graph: "
                    + dependency_graph.toString(SemanticGraph.OutputFormat.LIST));
        }
    }

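But I also want to extract "Thomas moved e-mails". This is easy to see in the dependency tree from "conj:and". Is there an efficient way to do this kind of extraction? I keep nesting loops inside loops, and that cannot be the solution.

This is my current loop-and-if disaster: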

    // tdl is presumably the list of typed dependencies; its declaration is not shown
    for (int i = 0; i < tdl.size(); i++) {

        // Starting from the subject
        if (tdl.get(i).reln().toString().contains("nsubj")) {

            String subjekt = "";

            // Catch the case where the subject is a person (pronoun)
            if (tdl.get(i).dep().toString().contains("PRP")) {
                subjekt = "PERSON";
            } else {
                subjekt = tdl.get(i).dep().word();
            }

            String praedikat = tdl.get(i).gov().word();

            System.out.println(subjekt + praedikat);

            for (int j = 0; j < tdl.size(); j++) {

                // Compare strings with equals(); == only compares references
                if ("dobj".equals(tdl.get(j).reln().toString())
                        && tdl.get(i).gov().index() == tdl.get(j).gov().index()) {

                    // Look for the object
                    System.out.println(praedikat + "(" + subjekt + "," + tdl.get(j).dep().word() + ")");

                } else if (tdl.get(j).reln().toString().contains("conj")
                        && tdl.get(i).gov().index() == tdl.get(j).dep().index()) {

                    // The predicate is a conjunct: look for the object on the verb it is conjoined with
                    for (int k = 0; k < tdl.size(); k++) {

                        if ("dobj".equals(tdl.get(k).reln().toString())
                                && tdl.get(k).gov().index() == tdl.get(j).gov().index()) {

                            System.out.println(praedikat + "(" + subjekt + "," + tdl.get(k).dep().word() + ")");

                        }
                    }

                } else if (tdl.get(j).reln().toString().contains("nmod:agent")
                        && tdl.get(j).gov().index() == tdl.get(i).gov().index()) {

                    System.out.println(praedikat + "(" + subjekt + "," + tdl.get(j).dep().word() + ")");

                }

            }
        }
    }
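
One possible way to avoid the nested scans, sketched in plain Java rather than against the CoreNLP classes: index the edges once, then treat verbs joined by a conj edge as a group that shares the passive subject and the by-agent. The `ConjPairs` and `Dep` classes, the `WORDS` array, and the method names are hypothetical stand-ins introduced for this illustration; in a real program the triples would come from the dependency graph. The sharing assumption holds for the example sentence, but it is not true of every coordination.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ConjPairs {

    // Hypothetical stand-in for one dependency edge rel(gov, dep), by token index.
    static final class Dep {
        final String rel;
        final int gov;
        final int dep;

        Dep(String rel, int gov, int dep) {
            this.rel = rel;
            this.gov = gov;
            this.dep = dep;
        }
    }

    // Tokens of the example sentence, addressed by their dependency index.
    static final String[] WORDS = {"ROOT", "Also", ",", "important", "e-mails", "have",
            "to", "be", "moved", "and", "marked", "by", "Thomas", "."};

    // A hand-typed subset of the edges from the listing in the question.
    static List<Dep> sample() {
        List<Dep> d = new ArrayList<>();
        d.add(new Dep("nsubj", 5, 4));
        d.add(new Dep("nsubjpass", 8, 4));
        d.add(new Dep("xcomp", 5, 8));
        d.add(new Dep("xcomp", 5, 10));
        d.add(new Dep("conj:and", 8, 10));
        d.add(new Dep("nmod:by", 10, 12));
        return d;
    }

    static List<String> extract(List<Dep> deps) {
        // One pass: outgoing edges per governor, and conj partners in both directions.
        Map<Integer, List<Dep>> byGov = new HashMap<>();
        Map<Integer, List<Integer>> conj = new HashMap<>();
        for (Dep d : deps) {
            byGov.computeIfAbsent(d.gov, k -> new ArrayList<>()).add(d);
            if (d.rel.startsWith("conj")) {
                conj.computeIfAbsent(d.gov, k -> new ArrayList<>()).add(d.dep);
                conj.computeIfAbsent(d.dep, k -> new ArrayList<>()).add(d.gov);
            }
        }

        List<String> out = new ArrayList<>();
        for (Dep agent : deps) {
            if (!agent.rel.equals("nmod:by")) {
                continue;
            }
            // The verb carrying the agent plus its direct conj partners (no transitive closure).
            List<Integer> group = new ArrayList<>();
            group.add(agent.gov);
            group.addAll(conj.getOrDefault(agent.gov, Collections.emptyList()));

            // Assumption: a passive subject found on any verb of the group is shared by all.
            int subject = -1;
            for (int verb : group) {
                for (Dep e : byGov.getOrDefault(verb, Collections.emptyList())) {
                    if (e.rel.equals("nsubjpass")) {
                        subject = e.dep;
                    }
                }
            }
            if (subject < 0) {
                continue;
            }
            for (int verb : group) {
                out.add(WORDS[agent.dep] + " " + WORDS[verb] + " " + WORDS[subject]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints "Thomas marked e-mails" and then "Thomas moved e-mails".
        for (String s : extract(sample())) {
            System.out.println(s);
        }
    }
}
```

After the single indexing pass, each lookup is a map access instead of another scan of the whole list, and the conj link is followed explicitly rather than through a third nested loop.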

0 answers:

No answers yet.