Question

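My main problem is that I don't know how to extract nodes from a GrammaticalStructure. I am using englishPCFG.ser from Java in NetBeans. My goal is to find out the quality of the screen, for example:

The screen of iphone 4 is great.

I want to extract "screen" and "great". How can I extract the NN (screen) and the VP (great)?

The code I wrote is: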

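import java.util.Arrays;
import java.util.Collection;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.*;

// Load the serialized English PCFG grammar
LexicalizedParser lp = new LexicalizedParser("C:\\englishPCFG.ser");
lp.setOptionFlags(new String[]{"-maxLength", "80", "-retainTmpSubcategories"});

// Parse the sentence and print the phrase-structure tree
String sent = "the screen is very good.";
Tree parse = (Tree) lp.apply(Arrays.asList(sent));
parse.pennPrint();
System.out.println();

// Convert the tree to collapsed typed dependencies
TreebankLanguagePack tlp = new PennTreebankLanguagePack();
GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();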

Answer 1

The collection tdl is a list of typed dependencies. For the sentence "The screen of iphone 4 is great.", it contains:

det(screen-2, the-1)
nsubj(great-7, screen-2)
amod(4-5, iphone-4)
prep_of(screen-2, 4-5)
cop(great-7, is-6)

(as you can see by trying it online).

So the dependency you want, nsubj(great-7, screen-2), is in that list. nsubj means that "screen" is the subject of "great".
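As a rough illustration of picking that relation out of the list, here is a minimal, library-free sketch. The `Dep` class, `subjectPairs` and `exampleDeps` are hypothetical stand-ins invented for this example, not part of the Stanford Parser. With the real library you would loop over the `TypedDependency` objects in tdl and read the relation name, governor and dependent from each one (in the versions I know, via `reln()`, `gov()` and `dep()`, but check the Javadoc of your release).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NsubjExtractor {

    /** Hypothetical stand-in for the parser's typed-dependency objects. */
    static final class Dep {
        final String reln;      // relation name, e.g. "nsubj"
        final String governor;  // head word, e.g. "great"
        final String dependent; // dependent word, e.g. "screen"

        Dep(String reln, String governor, String dependent) {
            this.reln = reln;
            this.governor = governor;
            this.dependent = dependent;
        }
    }

    /** Returns "dependent -> governor" for every nsubj relation in the list. */
    static List<String> subjectPairs(List<Dep> deps) {
        List<String> pairs = new ArrayList<>();
        for (Dep d : deps) {
            if ("nsubj".equals(d.reln)) {
                pairs.add(d.dependent + " -> " + d.governor);
            }
        }
        return pairs;
    }

    /** The five dependencies listed in the answer, with word indices dropped. */
    static List<Dep> exampleDeps() {
        return Arrays.asList(
            new Dep("det", "screen", "the"),
            new Dep("nsubj", "great", "screen"),
            new Dep("amod", "4", "iphone"),
            new Dep("prep_of", "screen", "4"),
            new Dep("cop", "great", "is"));
    }

    public static void main(String[] args) {
        System.out.println(subjectPairs(exampleDeps())); // prints [screen -> great]
    }
}
```

The same loop can collect other relations (for example cop or amod) by changing the relation name being compared.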

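The dependency collection is just a collection (a List). For more sophisticated further processing, people commonly want to turn the dependencies into a graph structure that supports various kinds of searching and traversal. There are several ways to do this. We often use the jgrapht library (http://www.jgrapht.org/). But that is code you would then write yourself.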
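To give an idea of what such a graph structure might look like, here is a small sketch that uses a plain `java.util` adjacency map instead of jgrapht, so that it runs without extra libraries. The `DependencyGraph` class and its methods are hypothetical names made up for this example. Nodes are the word-index labels from the list above, and each edge goes from governor to dependent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DependencyGraph {
    // governor -> outgoing edges, each stored as {relation, dependent}
    private final Map<String, List<String[]>> edges = new LinkedHashMap<>();

    void addEdge(String reln, String governor, String dependent) {
        edges.computeIfAbsent(governor, k -> new ArrayList<>())
             .add(new String[] {reln, dependent});
    }

    private List<String[]> outgoing(String word) {
        List<String[]> out = edges.get(word);
        return out == null ? Collections.<String[]>emptyList() : out;
    }

    /** Dependents of a word that are attached through the given relation. */
    List<String> dependents(String governor, String reln) {
        List<String> result = new ArrayList<>();
        for (String[] e : outgoing(governor)) {
            if (e[0].equals(reln)) result.add(e[1]);
        }
        return result;
    }

    /** The head word plus everything reachable from it, depth-first. */
    List<String> subtree(String head) {
        List<String> result = new ArrayList<>();
        collect(head, new HashSet<String>(), result);
        return result;
    }

    private void collect(String word, Set<String> seen, List<String> result) {
        if (!seen.add(word)) return; // guard against cycles
        result.add(word);
        for (String[] e : outgoing(word)) collect(e[1], seen, result);
    }

    /** Graph for the five dependencies listed in the answer. */
    static DependencyGraph example() {
        DependencyGraph g = new DependencyGraph();
        g.addEdge("det", "screen-2", "the-1");
        g.addEdge("nsubj", "great-7", "screen-2");
        g.addEdge("amod", "4-5", "iphone-4");
        g.addEdge("prep_of", "screen-2", "4-5");
        g.addEdge("cop", "great-7", "is-6");
        return g;
    }

    public static void main(String[] args) {
        DependencyGraph g = example();
        System.out.println(g.dependents("great-7", "nsubj")); // prints [screen-2]
        System.out.println(g.subtree("screen-2")); // prints [screen-2, the-1, 4-5, iphone-4]
    }
}
```

Looking up the nsubj dependent of "great-7" and then taking its subtree gives the whole subject phrase, which is the kind of query that is awkward on the flat list. A library such as jgrapht would provide the traversal for you.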

How to get useful nodes from Stanford Parser NLP?
