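I am trying to compare the output of NLTK's Stanford Parser wrapper with the output of the Stanford Parser itself, but the two differ and I don't know why. I have checked the related questions, but they did not help much.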
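from nltk.parse.stanford import StanfordDependencyParser

stan_dep_parser = StanfordDependencyParser()  # Stanford parser wrapper from NLTK
dependency_parser = stan_dep_parser.raw_parse("Four men died in an accident")
dep = next(dependency_parser)  # first parse of the sentence
for triple in dep.triples():
    # each triple is ((head_word, head_tag), relation, (dependent_word, dependent_tag))
    print("{} ( {} , {} )".format(triple[1], triple[0][0], triple[2][0]))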
Current output:
nsubj ( died , men )
nummod ( men , Four )
nmod ( died , accident )
case ( accident , in )
det ( accident , an )
Expected output, as produced by the Stanford Parser:
nummod(men-2, Four-1)
nsubj(died-3, men-2)
root(ROOT-0, died-3)
case(accident-6, in-4)
det(accident-6, an-5)
nmod(died-3, accident-6)
NLTK version: 3.2.4. Stanford Parser: stanford-parser-3.8.0-models
Answer 0 (score: 2)
I solved the problem myself:
I find the 'root' (or 'head') of the sentence:
from nltk.parse.stanford import StanfordDependencyParser

final_dependency = []
sentence = "Four men died in an accident"
dependency_tree = StanfordDependencyParser()
dependency_parser = dependency_tree.raw_parse(sentence)
parsetree = list(dependency_parser)[0]
for k in parsetree.nodes.values():
    # the word whose head is the artificial node 0 is the root of the sentence
    if k["head"] == 0:
        final_dependency.append("{}(ROOT-{}, {}-{})".format(
            k["rel"], k["head"], k["word"], k["address"]))
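Then I used simple string manipulation to append a number to each word, as in the expected output; the number is the index of that word in the sentence.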
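As a rough sketch of that last step, the indexed lines can also be built directly from the node table instead of patching strings afterwards. The helper name `format_dependencies` and the `example_nodes` dictionary are made up for illustration; `example_nodes` is a hand-written stand-in that only mirrors the four fields used above (`address`, `word`, `head`, `rel`), so the snippet runs without Java or the Stanford jars. With a real parse you would presumably pass `parsetree.nodes` instead.

```python
def format_dependencies(nodes):
    """Return 'rel(head-i, word-j)' strings, ordered by word position."""
    lines = []
    # Skip the artificial top node (address 0) and any None key.
    for address in sorted(a for a in nodes if a):
        node = nodes[address]
        head = node["head"]
        if head is None:
            continue  # node is not attached to anything
        # Head 0 is the artificial root; otherwise look up the head's word.
        head_word = "ROOT" if head == 0 else nodes[head]["word"]
        lines.append("{}({}-{}, {}-{})".format(
            node["rel"], head_word, head, node["word"], address))
    return lines


# Hypothetical stand-in for parsetree.nodes, written by hand from the
# expected output in the question (not produced by running the parser).
example_nodes = {
    0: {"address": 0, "word": None, "head": None, "rel": None},
    1: {"address": 1, "word": "Four", "head": 2, "rel": "nummod"},
    2: {"address": 2, "word": "men", "head": 3, "rel": "nsubj"},
    3: {"address": 3, "word": "died", "head": 0, "rel": "root"},
    4: {"address": 4, "word": "in", "head": 6, "rel": "case"},
    5: {"address": 5, "word": "an", "head": 6, "rel": "det"},
    6: {"address": 6, "word": "accident", "head": 3, "rel": "nmod"},
}

result = format_dependencies(example_nodes)
for line in result:
    print(line)
```

For this stand-in the lines come out in the same form and order as the expected output in the question, including `root(ROOT-0, died-3)`. Sorting by address is what gives the word-order listing; `triples()` walks the tree from the root instead, which is why its order differs.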