Parsing Stanford dependencies

Time: 2014-06-18 06:28:25

Tags: java nlp stanford-nlp

I need to extract basic relations from a given text. I found the Stanford dependencies and looked at the first basic example:

LexicalizedParser lp = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");

TreebankLanguagePack tlp = new PennTreebankLanguagePack();
GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();

String[] sent = new String[]{"vladimir", "putin", "was", "born", "in", "st.", "petersburg", "and", "he", "was", "not", "born", "in", "berlin", "."};
Tree parse = lp.apply(Sentence.toWordList(sent));
GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
Collection<TypedDependency> tdl = gs.typedDependencies();

System.out.println(tdl);

The result is:

[nn(putin-2, vladimir-1), nsubjpass(born-4, putin-2), auxpass(born-4, was-3), root(ROOT-0, born-4), prep(born-4, in-5), nn(petersburg-7, st.-6), pobj(in-5, petersburg-7), cc(born-4, and-8), nsubjpass(born-12, he-9), auxpass(born-12, was-10), neg(born-12, not-11), conj(born-4, born-12), prep(born-12, in-13), pobj(in-13, berlin-14)]

The same output, one dependency per line:

nn(putin-2, vladimir-1)
nsubjpass(born-4, putin-2)
auxpass(born-4, was-3)
root(ROOT-0, born-4)
prep(born-4, in-5)
nn(petersburg-7, st.-6)
pobj(in-5, petersburg-7)
cc(born-4, and-8)
nsubjpass(born-12, he-9)
auxpass(born-12, was-10)
neg(born-12, not-11)
conj(born-4, born-12)
prep(born-12, in-13)
pobj(in-13, berlin-14)
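
A minimal sketch of how a one-per-line listing like this can be produced. `DependencyPrinter` and `onePerLine` are hypothetical names, not part of the Stanford parser; the helper only relies on `toString()`, so it should work for the `tdl` collection from the example as well as for the plain strings used here.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper (not a Stanford API): prints any collection one element per line.
class DependencyPrinter {

    // Joins the toString() of each element with newlines.
    static String onePerLine(Collection<?> items) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Object item : items) {
            if (!first) {
                sb.append('\n');
            }
            sb.append(item);
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Plain strings stand in for TypedDependency objects here.
        System.out.println(onePerLine(Arrays.asList(
                "nn(putin-2, vladimir-1)",
                "nsubjpass(born-4, putin-2)")));
    }
}
```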

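My question now is: is there an existing parser for extracting relations from these dependencies? For example, I want to get the relation "born in" between "vladimir putin" and "st. petersburg", so I need the following dependencies: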

nn(putin-2, vladimir-1)
nsubjpass(born-4, putin-2)
auxpass(born-4, was-3)
prep(born-4, in-5)
nn(petersburg-7, st.-6)
pobj(in-5, petersburg-7)

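So I have all the information I need. I could write a parser that returns the governor noun, the relation, and the dependent noun, but if such a parser already exists I don't need to write one myself.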

So is there one?

2 Answers:

Answer 0: (score: 0)

Semantic role labeling might be useful: http://cogcomp.cs.illinois.edu/demo/srl/

It finds the relations between text spans and their triggers (e.g. verbs).

Answer 1: (score: 0)

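AFAIK, the problem of extracting relations can be approached in the following ways:

  1. Using Tregex on the constituency parse output (the Stanford dependency parser supposedly follows this route).

  2. Using dependency parser output, e.g. from the Stanford dependency parser. For processing that output, you may find Chris Manning's response useful: Stanford Parser - Traversing the typed dependencies graph

  3. Using a semantic role labeler, as Daniel pointed out in his answer.

I suggest trying the options in the order 2, 1, 3.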
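
Option 2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing Stanford API: `RelationSketch`, `Dep`, `parse`, `phrase` and `extract` are hypothetical names, and the code works on the printed form of the typed dependencies from the question rather than on `TypedDependency` objects. It assumes the passive pattern nsubjpass + prep + pobj, and that `nn` modifiers are listed in sentence order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extracts (subject | relation | object) from printed typed dependencies.
class RelationSketch {

    // The dependencies printed in the question, one string per dependency.
    static final List<String> SAMPLE = Arrays.asList(
            "nn(putin-2, vladimir-1)", "nsubjpass(born-4, putin-2)", "auxpass(born-4, was-3)",
            "root(ROOT-0, born-4)", "prep(born-4, in-5)", "nn(petersburg-7, st.-6)",
            "pobj(in-5, petersburg-7)", "cc(born-4, and-8)", "nsubjpass(born-12, he-9)",
            "auxpass(born-12, was-10)", "neg(born-12, not-11)", "conj(born-4, born-12)",
            "prep(born-12, in-13)", "pobj(in-13, berlin-14)");

    // One dependency: relation name, governor word/index, dependent word/index.
    static final class Dep {
        final String rel, gov, dep;
        final int govIdx, depIdx;

        Dep(String rel, String gov, int govIdx, String dep, int depIdx) {
            this.rel = rel; this.gov = gov; this.govIdx = govIdx;
            this.dep = dep; this.depIdx = depIdx;
        }
    }

    // Matches e.g. "nn(petersburg-7, st.-6)".
    private static final Pattern DEP =
            Pattern.compile("(\\w+)\\((\\S+)-(\\d+), (\\S+)-(\\d+)\\)");

    // Parses the printed dependencies; lines that do not match are skipped.
    static List<Dep> parse(List<String> lines) {
        List<Dep> deps = new ArrayList<Dep>();
        for (String line : lines) {
            Matcher m = DEP.matcher(line.trim());
            if (m.matches()) {
                deps.add(new Dep(m.group(1), m.group(2), Integer.parseInt(m.group(3)),
                        m.group(4), Integer.parseInt(m.group(5))));
            }
        }
        return deps;
    }

    // Head word preceded by its nn modifiers (assumed to be listed in sentence order).
    static String phrase(List<Dep> deps, String head, int headIdx) {
        StringBuilder sb = new StringBuilder();
        for (Dep d : deps) {
            if (d.rel.equals("nn") && d.govIdx == headIdx) {
                sb.append(d.dep).append(' ');
            }
        }
        return sb.append(head).toString();
    }

    // For each passive subject, follows prep -> pobj from the same verb.
    static List<String> extract(List<Dep> deps) {
        List<String> out = new ArrayList<String>();
        for (Dep subj : deps) {
            if (!subj.rel.equals("nsubjpass")) continue;
            boolean negated = false;
            for (Dep d : deps) {
                if (d.rel.equals("neg") && d.govIdx == subj.govIdx) negated = true;
            }
            for (Dep prep : deps) {
                if (!prep.rel.equals("prep") || prep.govIdx != subj.govIdx) continue;
                for (Dep obj : deps) {
                    if (!obj.rel.equals("pobj") || obj.govIdx != prep.depIdx) continue;
                    String relation = (negated ? "not " : "") + subj.gov + " " + prep.dep;
                    out.add(phrase(deps, subj.dep, subj.depIdx) + " | " + relation
                            + " | " + phrase(deps, obj.dep, obj.depIdx));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String triple : extract(parse(SAMPLE))) {
            System.out.println(triple);
        }
        // prints:
        // vladimir putin | born in | st. petersburg
        // he | not born in | berlin
    }
}
```

Matching on indices rather than on words keeps the two occurrences of "born" apart. With the real parser, the same walk could presumably be done directly on the `TypedDependency` collection instead of on strings.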