I need to extract basic relations from a given text. I found the Stanford Dependencies and looked at the first basic example:
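import java.util.Collection;
import edu.stanford.nlp.ling.Sentence;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.*;

// Load the English PCFG model and the Penn Treebank language pack.
LexicalizedParser lp = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
TreebankLanguagePack tlp = new PennTreebankLanguagePack();
GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
String[] sent = new String[]{"vladimir", "putin", "was", "born", "in", "st.", "petersburg", "and", "he", "was", "not", "born", "in", "berlin", "."};
// Parse the pre-tokenized sentence, then derive typed dependencies from the tree.
Tree parse = lp.apply(Sentence.toWordList(sent));
GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
Collection<TypedDependency> tdl = gs.typedDependencies();
System.out.println(tdl);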
The result is:
[nn(putin-2, vladimir-1), nsubjpass(born-4, putin-2), auxpass(born-4, was-3), root(ROOT-0, born-4), prep(born-4, in-5), nn(petersburg-7, st.-6), pobj(in-5, petersburg-7), cc(born-4, and-8), nsubjpass(born-12, he-9), auxpass(born-12, was-10), neg(born-12, not-11), conj(born-4, born-12), prep(born-12, in-13), pobj(in-13, berlin-14)]
The same output, one dependency per line:
nn(putin-2, vladimir-1)
nsubjpass(born-4, putin-2)
auxpass(born-4, was-3)
root(ROOT-0, born-4)
prep(born-4, in-5)
nn(petersburg-7, st.-6)
pobj(in-5, petersburg-7)
cc(born-4, and-8)
nsubjpass(born-12, he-9)
auxpass(born-12, was-10)
neg(born-12, not-11)
conj(born-4, born-12)
prep(born-12, in-13)
pobj(in-13, berlin-14)
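My question now: is there already a parser for these parsed relations? For example, I want to get a relation such as "born in" between "vladimir putin" and "st. petersburg", so I need the following dependencies: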
nn(putin-2, vladimir-1)
nsubjpass(born-4, putin-2)
auxpass(born-4, was-3)
prep(born-4, in-5)
nn(petersburg-7, st.-6)
pobj(in-5, petersburg-7)
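So I have all the information I need. I could write a parser that returns the governor noun, the relation, and the dependent noun, but if such a parser already exists I don't need to write one myself.
So is there one?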
Answer 0 (score: 0)
Semantic role labeling might be useful: http://cogcomp.cs.illinois.edu/demo/srl/
It finds the relations between text spans and their triggers (e.g. verbs).
Answer 1 (score: 0)
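AFAIK, relation extraction can be approached in the following ways:
1. Using Tregex on the constituency parse output (the Stanford dependency parser reportedly follows this route).
2. Using dependency parser output, e.g. from the Stanford dependency parser. For traversing the parse output, you may find Chris Manning's response useful: Stanford Parser - Traversing the typed dependencies graph
3. Using a semantic role labeler, as Daniel pointed out in his answer.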
I would suggest trying the options in the order 2, 1, 3.
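To make option 2 a bit more concrete, here is a minimal sketch of the kind of traversal the question describes: follow nsubj/nsubjpass from a verb to its subject, then prep and pobj to the object, and attach nn modifiers to each noun. To stay self-contained it works on the printed form of the dependencies (the strings shown in the question) rather than on TypedDependency objects; the class RelationSketch and all of its methods are hypothetical names made up for this illustration, and the hard-coded pattern only covers the prep/pobj case, so treat it as a starting point rather than a general relation extractor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Stanford parser.
class RelationSketch {

    // Matches the printed form "reln(govWord-govIndex, depWord-depIndex)".
    private static final Pattern DEP =
            Pattern.compile("(\\w+)\\((.+)-(\\d+), (.+)-(\\d+)\\)");

    // One dependency edge: relation name, governor and dependent.
    static final class Edge {
        final String reln, govWord, depWord;
        final int gov, dep;

        Edge(Matcher m) {
            reln = m.group(1);
            govWord = m.group(2);
            gov = Integer.parseInt(m.group(3));
            depWord = m.group(4);
            dep = Integer.parseInt(m.group(5));
        }
    }

    // Parses printed dependencies; lines that do not match are skipped.
    static List<Edge> parse(List<String> lines) {
        List<Edge> edges = new ArrayList<Edge>();
        for (String line : lines) {
            Matcher m = DEP.matcher(line.trim());
            if (m.matches()) {
                edges.add(new Edge(m));
            }
        }
        return edges;
    }

    // Head noun plus its "nn" modifiers, joined in sentence order.
    static String nounPhrase(List<Edge> edges, int head, String headWord) {
        TreeMap<Integer, String> words = new TreeMap<Integer, String>();
        words.put(head, headWord);
        for (Edge e : edges) {
            if (e.reln.equals("nn") && e.gov == head) {
                words.put(e.dep, e.depWord);
            }
        }
        return String.join(" ", words.values());
    }

    // True if the verb at index 'verb' has a "neg" dependent.
    static boolean negated(List<Edge> edges, int verb) {
        for (Edge e : edges) {
            if (e.reln.equals("neg") && e.gov == verb) {
                return true;
            }
        }
        return false;
    }

    // Returns "subject | relation | object" strings for every
    // subject -> verb -> prep -> pobj chain found in the dependencies.
    static List<String> extract(List<String> lines) {
        List<Edge> edges = parse(lines);
        List<String> triples = new ArrayList<String>();
        for (Edge subj : edges) {
            if (!subj.reln.equals("nsubj") && !subj.reln.equals("nsubjpass")) {
                continue;
            }
            for (Edge prep : edges) {
                if (!prep.reln.equals("prep") || prep.gov != subj.gov) {
                    continue;
                }
                for (Edge pobj : edges) {
                    if (!pobj.reln.equals("pobj") || pobj.gov != prep.dep) {
                        continue;
                    }
                    String relation = (negated(edges, subj.gov) ? "not " : "")
                            + subj.govWord + " " + prep.depWord;
                    triples.add(nounPhrase(edges, subj.dep, subj.depWord)
                            + " | " + relation + " | "
                            + nounPhrase(edges, pobj.dep, pobj.depWord));
                }
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        // The six dependencies singled out in the question.
        List<String> deps = Arrays.asList(
                "nn(putin-2, vladimir-1)", "nsubjpass(born-4, putin-2)",
                "auxpass(born-4, was-3)", "prep(born-4, in-5)",
                "nn(petersburg-7, st.-6)", "pobj(in-5, petersburg-7)");
        // Prints: vladimir putin | born in | st. petersburg
        for (String triple : extract(deps)) {
            System.out.println(triple);
        }
    }
}
```

In real code you would probably skip the string parsing and read the same fields from each TypedDependency via reln(), gov() and dep(). Note the neg check: on the full sentence from the question the second clause comes out as "he | not born in | berlin", which you would otherwise misread as a positive relation.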