Parsing node labels to extract clauses from a syntax tree

Time: 2017-09-14 14:42:24

Tags: java stanford-nlp

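As an example, I have the following parse tree from the Stanford parser. How can I pick out nodes with labels such as S and SBAR so that I can eventually extract the clauses? As a starting point I tried a small piece of code (obviously incorrect) using various Tree methods, but it did not give me the expected result.

Code: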

for (Tree subtree : parseTree.getLeaves()) {
    if (subtree.label().equals("S") || subtree.label().equals("SBAR"))
        System.out.println("SUBTREE:::" + "\t" + subtree.getLeaves());
}

Parse tree:

(ROOT
      (S
        (NP
          (NP (DT A) (NNP Bristol) (NN hospital))
          (SBAR
            (WHNP (WDT that))
            (S
              (VP (VBD retained)
                (NP
                  (NP (DT the) (NNS hearts))
                  (PP (IN of)
                    (NP
                      (NP (CD 300) (NNS children))
                      (SBAR
                        (WHNP (WP who))
                        (S
                          (VP (VBD died)
                            (PP (IN in)
                              (NP (JJ complex) (NNS operations)))))))))))))
        (VP (VBD behaved)
          (ADVP (IN in) (DT a))
          ('' '')
          (S
            (VP (VBG cavalier) ('' '')
              (NP (NN fashion))))
          (PP (IN towards)
            (NP (DT the) (NNS parents))))
        (. .)))

2 answers:

Answer 0 (score: 0)

You can walk through every node of the tree, not only the leaves, and pick out the ones labelled S or SBAR.
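A minimal sketch of that traversal, assuming a Penn-style bracketed tree like the one in the question. `Node` and `ClauseSketch` are hypothetical stand-ins written for this illustration so the example runs without CoreNLP; they are not the Stanford `Tree` class. With the real `Tree`, as far as I know, iterating over the tree itself (`for (Tree subtree : parseTree)`) visits every subtree, and `subtree.label().value()` gives the category as a `String`. The code in the question cannot match for two reasons: `getLeaves()` returns only the word nodes, and `label().equals("S")` compares a `Label` with a `String`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse-tree node; NOT the Stanford Tree class.
class Node {
    final String label;
    final List<Node> children = new ArrayList<>();
    Node(String label) { this.label = label; }
    boolean isLeaf() { return children.isEmpty(); }
}

class ClauseSketch {
    private final String text;
    private int pos = 0;

    private ClauseSketch(String text) { this.text = text; }

    // Read a bracketed string such as "(S (NP (DT A)) (VP (VBD ran)))".
    static Node read(String bracketed) {
        return new ClauseSketch(bracketed).readNode();
    }

    private void skipSpaces() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) pos++;
    }

    private String readToken() {
        int start = pos;
        while (pos < text.length()) {
            char c = text.charAt(pos);
            if (c == '(' || c == ')' || Character.isWhitespace(c)) break;
            pos++;
        }
        return text.substring(start, pos);
    }

    private Node readNode() {
        skipSpaces();
        if (text.charAt(pos) != '(') {
            return new Node(readToken()); // a bare token is a leaf (a word)
        }
        pos++; // consume '('
        skipSpaces();
        Node node = new Node(readToken());
        skipSpaces();
        while (text.charAt(pos) != ')') {
            node.children.add(readNode());
            skipSpaces();
        }
        pos++; // consume ')'
        return node;
    }

    // Pre-order walk over every node, keeping the S and SBAR phrases.
    static void collectClauses(Node node, List<Node> out) {
        if (!node.isLeaf() && (node.label.equals("S") || node.label.equals("SBAR"))) {
            out.add(node);
        }
        for (Node child : node.children) collectClauses(child, out);
    }

    // Words under a node, left to right.
    static void collectWords(Node node, List<String> out) {
        if (node.isLeaf()) { out.add(node.label); return; }
        for (Node child : node.children) collectWords(child, out);
    }

    // One "LABEL: words" string per S or SBAR node, in pre-order.
    static List<String> clauseTexts(String bracketed) {
        List<Node> clauses = new ArrayList<>();
        collectClauses(read(bracketed), clauses);
        List<String> texts = new ArrayList<>();
        for (Node clause : clauses) {
            List<String> words = new ArrayList<>();
            collectWords(clause, words);
            texts.add(clause.label + ": " + String.join(" ", words));
        }
        return texts;
    }

    public static void main(String[] args) {
        String tree = "(ROOT (S (NP (NP (DT A) (NN hospital)) (SBAR (WHNP (WDT that)) "
            + "(S (VP (VBD closed))))) (VP (VBD reopened))))";
        for (String t : clauseTexts(tree)) System.out.println(t);
    }
}
```

Nested clauses are reported separately, so an SBAR and the S inside it both appear; whether to keep only the outermost or the innermost match depends on what you count as a clause.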

Answer 1 (score: 0)

Another approach is to use Tregex. Here is some sample code:

package edu.stanford.nlp.examples;

import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.trees.tregex.*;
import edu.stanford.nlp.util.*;

import java.util.*;

public class TregexUsageExample {

  public static void main(String[] args) {
    // set up pipeline
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    // English example sentence (the one from the question)
    Annotation annotation =
        new Annotation(
            "A Bristol hospital that retained the hearts of 300 children who died in " +
                "complex operations behaved in a \"cavalier fashion\" towards the parents");
    pipeline.annotate(annotation);
    // get first sentence
    CoreMap firstSentence = annotation.get(CoreAnnotations.SentencesAnnotation.class).get(0);
    Tree firstSentenceTree = firstSentence.get(TreeCoreAnnotations.TreeAnnotation.class);
    // use Tregex to match
    String SorSBARPattern = "/SBAR|^S$/";
    TregexPattern SorSBARTregexPattern = TregexPattern.compile(SorSBARPattern);
    TregexMatcher SorSBARTregexMatcher = SorSBARTregexPattern.matcher(firstSentenceTree);
    while (SorSBARTregexMatcher.find()) {
      SorSBARTregexMatcher.getMatch().pennPrint();
    }
  }
}
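
To see which labels the pattern `/SBAR|^S$/` selects, here is a small sketch using plain `java.util.regex`. It assumes that Tregex tests a regular expression written between slashes against the node label with `find()` semantics rather than a full match; the class name `LabelPatternSketch` is made up for this illustration.

```java
import java.util.regex.Pattern;

class LabelPatternSketch {
    // Assumption: Tregex applies a /regex/ node description with find(), not matches().
    static final Pattern S_OR_SBAR = Pattern.compile("SBAR|^S$");

    // True if the label contains "SBAR" or is exactly "S".
    static boolean selects(String label) {
        return S_OR_SBAR.matcher(label).find();
    }

    public static void main(String[] args) {
        for (String label : new String[] {"S", "SBAR", "SBARQ", "SQ", "NP"}) {
            System.out.println(label + " -> " + selects(label));
        }
    }
}
```

Under that assumption the pattern also selects labels such as SBARQ, because `SBAR` is not anchored. If only S and SBAR are wanted, a stricter pattern such as `/^(S|SBAR)$/` would be one option.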