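SemgrexPattern lemma attribute doesn't seem to work

Time: 2016-08-03 14:36:14

Tags: java nlp pattern-matching stanford-nlp

Below is a very simple example that uses Stanford NLP's SemgrexPattern. I don't understand why it finds no matches with {lemma:/eat/} while it does find a match with {word:/eats/}. I used the LemmaAnnotation class to get the lemma of the verb "eats", and it is "eat".

Thanks for your help :)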

package Project;
import java.io.File;
import java.util.Scanner;

import edu.stanford.nlp.parser.lexparser.TreebankLangParserParams;
import edu.stanford.nlp.parser.lexparser.EnglishTreebankParserParams;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphFactory;
import edu.stanford.nlp.semgraph.semgrex.SemgrexMatcher;
import edu.stanford.nlp.semgraph.semgrex.SemgrexPattern;
import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.GrammaticalStructureFactory;
import edu.stanford.nlp.trees.Tree;

public class SemgrexDemo {
  public static void main(String[] args) {
    String treeString = "(ROOT (S (NP (NNP John)) (VP (VBZ eats) (NP (NN pizza))) (. .)))";
    Tree tree = Tree.valueOf(treeString);
    SemanticGraph graph = SemanticGraphFactory.generateUncollapsedDependencies(tree);
    TreebankLangParserParams params = new EnglishTreebankParserParams();
    GrammaticalStructureFactory gsf = params.treebankLanguagePack().grammaticalStructureFactory(params.treebankLanguagePack().punctuationWordRejectFilter(), params.typedDependencyHeadFinder());
    GrammaticalStructure gs = gsf.newGrammaticalStructure(tree);
    System.err.println(graph);
    SemgrexPattern semgrex = SemgrexPattern.compile("{}=A <<dobj=reln {lemma:/eat/}=B");
    SemgrexMatcher matcher = semgrex.matcher(graph);
    while (matcher.find()) {
      System.err.println(matcher.getNode("A") + " <<dobj " + matcher.getNode("B"));
    }
  }
}

1 answer:

Answer 0: (score: 1)

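When you parse a tree string into a Tree object, lemmas are not automatically added to the tokens. The lemma attribute of every node in the SemanticGraph is therefore null, so {lemma:/eat/} does not match any node.

You can add the lemmas with the lemma(String word, String pos) method of the Morphology class.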
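
A minimal sketch of that approach, assuming Stanford CoreNLP is on the classpath. The class name LemmaSemgrexDemo and the helper addLemmas are made up for illustration. It builds the same graph as in the question, sets each node's lemma from its word and POS tag, then runs the original pattern. Whether the pattern matches also depends on the dependency label your CoreNLP version produces (dobj in Stanford Dependencies; newer Universal Dependencies output may use obj instead), so treat the matching step as something to verify locally.

```java
import edu.stanford.nlp.ling.IndexedWord;
import edu.stanford.nlp.process.Morphology;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphFactory;
import edu.stanford.nlp.semgraph.semgrex.SemgrexMatcher;
import edu.stanford.nlp.semgraph.semgrex.SemgrexPattern;
import edu.stanford.nlp.trees.Tree;

public class LemmaSemgrexDemo {

  // Hypothetical helper: sets the lemma of every node from its word and POS tag.
  static void addLemmas(SemanticGraph graph) {
    Morphology morphology = new Morphology();
    for (IndexedWord node : graph.vertexSet()) {
      node.setLemma(morphology.lemma(node.word(), node.tag()));
    }
  }

  public static void main(String[] args) {
    String treeString = "(ROOT (S (NP (NNP John)) (VP (VBZ eats) (NP (NN pizza))) (. .)))";
    Tree tree = Tree.valueOf(treeString);
    SemanticGraph graph = SemanticGraphFactory.generateUncollapsedDependencies(tree);

    // Without this step node.lemma() is null, so {lemma:/eat/} cannot match.
    addLemmas(graph);

    SemgrexPattern semgrex = SemgrexPattern.compile("{}=A <<dobj=reln {lemma:/eat/}=B");
    SemgrexMatcher matcher = semgrex.matcher(graph);
    while (matcher.find()) {
      System.err.println(matcher.getNode("A") + " <<dobj " + matcher.getNode("B"));
    }
  }
}
```

If you build the graph through a full annotation pipeline that includes a lemmatizer rather than from a tree string, the lemmas should already be set on the tokens and this extra loop should not be needed.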