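As an example, I have the following parse tree from the Stanford Parser. How can I extract the subtrees labeled S and SBAR so that I can ultimately extract the clauses? As a starting point I tried a small piece of code (obviously incorrect) using different Tree methods, but it did not give me the expected result.
Code: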
for (Tree subtree : parseTree.getLeaves()) {
    if (subtree.label().equals("S") || subtree.label().equals("SBAR"))
        System.out.println("SUBTREE:::" + "\t" + subtree.getLeaves());
}
Parse tree:
(ROOT
(S
(NP
(NP (DT A) (NNP Bristol) (NN hospital))
(SBAR
(WHNP (WDT that))
(S
(VP (VBD retained)
(NP
(NP (DT the) (NNS hearts))
(PP (IN of)
(NP
(NP (CD 300) (NNS children))
(SBAR
(WHNP (WP who))
(S
(VP (VBD died)
(PP (IN in)
(NP (JJ complex) (NNS operations)))))))))))))
(VP (VBD behaved)
(ADVP (IN in) (DT a))
('' '')
(S
(VP (VBG cavalier) ('' '')
(NP (NN fashion))))
(PP (IN towards)
(NP (DT the) (NNS parents))))
(. .)))
Answer 0 (score: 0)
One approach is to walk through the tree and find the nodes labeled S and SBAR.
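The traversal can be sketched roughly as follows. This is a dependency-free illustration, not CoreNLP code: `ClauseExtractor`, `Node`, `parse` and `extractClauses` are hypothetical names made up for the example, and the tiny parser only handles simple Penn-style bracketed strings. The point it shows is that every node has to be visited, not only the leaves, and that the label has to be compared as a string. With CoreNLP's own `Tree`, the equivalent would presumably be to iterate over the tree itself instead of `getLeaves()` and to compare `subtree.label().value()` with "S" and "SBAR".

```java
import java.util.ArrayList;
import java.util.List;

class ClauseExtractor {

    /** Minimal stand-in for a parse-tree node (hypothetical, not CoreNLP's Tree). */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        boolean isLeaf() { return children.isEmpty(); }
    }

    /** Parses a Penn-style bracketed string such as "(S (NP (DT A)))". */
    static Node parse(String text) {
        String[] tokens = text.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        return parseNode(tokens, new int[] {0});
    }

    private static Node parseNode(String[] tokens, int[] pos) {
        String token = tokens[pos[0]++];
        if (!token.equals("(")) {
            return new Node(token);               // a bare token is a word (leaf)
        }
        Node node = new Node(tokens[pos[0]++]);   // first token after "(" is the label
        while (!tokens[pos[0]].equals(")")) {
            node.children.add(parseNode(tokens, pos));
        }
        pos[0]++;                                 // consume the closing ")"
        return node;
    }

    /** Collects the words under a node, left to right. */
    static void collectWords(Node node, List<String> words) {
        if (node.isLeaf()) {
            words.add(node.label);
            return;
        }
        for (Node child : node.children) {
            collectWords(child, words);
        }
    }

    /** Returns "LABEL: words" for every S or SBAR node, in pre-order. */
    static List<String> extractClauses(String bracketed) {
        List<String> clauses = new ArrayList<>();
        visit(parse(bracketed), clauses);
        return clauses;
    }

    private static void visit(Node node, List<String> clauses) {
        // Compare the label text itself; leaves are words, so they are skipped.
        if (!node.isLeaf() && (node.label.equals("S") || node.label.equals("SBAR"))) {
            List<String> words = new ArrayList<>();
            collectWords(node, words);
            clauses.add(node.label + ": " + String.join(" ", words));
        }
        for (Node child : node.children) {
            visit(child, clauses);
        }
    }

    public static void main(String[] args) {
        // A fragment of the parse tree from the question.
        String fragment = "(NP (NP (CD 300) (NNS children)) (SBAR (WHNP (WP who)) "
                + "(S (VP (VBD died) (PP (IN in) (NP (JJ complex) (NNS operations)))))))";
        for (String clause : extractClauses(fragment)) {
            System.out.println("SUBTREE:::\t" + clause);
        }
    }
}
```

The code in the question finds nothing because `getLeaves()` returns only the word nodes, and because `label()` returns a label object, so `equals("S")` against a plain string is not expected to succeed.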
Answer 1 (score: 0)
Another approach is to use Tregex. Here is some sample code:
package edu.stanford.nlp.examples;

import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.trees.tregex.*;
import edu.stanford.nlp.util.*;

import java.util.*;

public class TregexUsageExample {

    public static void main(String[] args) {
        // set up pipeline
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        // example sentence (the one from the question)
        Annotation annotation =
            new Annotation(
                "A Bristol hospital that retained the hearts of 300 children who died in " +
                "complex operations behaved in a \"cavalier fashion\" towards the parents");
        pipeline.annotate(annotation);
        // get first sentence and its parse tree
        CoreMap firstSentence = annotation.get(CoreAnnotations.SentencesAnnotation.class).get(0);
        Tree firstSentenceTree = firstSentence.get(TreeCoreAnnotations.TreeAnnotation.class);
        // use Tregex to match nodes whose label contains SBAR or is exactly S
        String SorSBARPattern = "/SBAR|^S$/";
        TregexPattern SorSBARTregexPattern = TregexPattern.compile(SorSBARPattern);
        TregexMatcher SorSBARTregexMatcher = SorSBARTregexPattern.matcher(firstSentenceTree);
        // print every matching subtree in Penn Treebank format
        while (SorSBARTregexMatcher.find()) {
            SorSBARTregexMatcher.getMatch().pennPrint();
        }
    }
}