NLTK Stanford dependency parser throws an AssertionError

Time: 2016-11-18 05:40:32

Tags: python nltk stanford-nlp dependency-parsing

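I am using Python 3 and NLTK with the Stanford dependency parser to parse a list of sentences, and then collecting the node information from every parsed sentence. Here is my code, which runs under Python 3 in a virtualenv named .python: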

from nltk.parse.stanford import StanfordDependencyParser
# `sentences` is the list of raw sentence strings to parse.
parser = StanfordDependencyParser('stanford-parser-full-2015-12-09/stanford-parser.jar', 'stanford-parser-full-2015-12-09/stanford-parser-3.6.0-models.jar')
# Each sentence yields an iterator of dependency graphs; flatten their node dicts into one list.
graph_nodes = sum([[dep_graph.nodes for dep_graph in dep_graphs] for dep_graphs in parser.raw_parse_sents(sentences)], [])

I found that the Stanford dependency parser keeps throwing an AssertionError on certain sentences. This is the error I get:

    graph_nodes = sum([[dep_graph.nodes for dep_graph in dep_graphs] for dep_graphs in self.parser.raw_parse_sents(sentences)], []);
  File "/Users/user/sent_code/.python/lib/python3.5/site-packages/nltk/parse/stanford.py", line 150, in raw_parse_sents
    return self._parse_trees_output(self._execute(cmd, '\n'.join(sentences), verbose))
  File "/Users/user/sent_code/.python/lib/python3.5/site-packages/nltk/parse/stanford.py", line 91, in _parse_trees_output
    res.append(iter([self._make_tree('\n'.join(cur_lines))]))
  File "/Users/user/sent_code/.python/lib/python3.5/site-packages/nltk/parse/stanford.py", line 339, in _make_tree
    return DependencyGraph(result, top_relation_label='root')
  File "/Users/user/sent_code/.python/lib/python3.5/site-packages/nltk/parse/dependencygraph.py", line 84, in __init__
    top_relation_label=top_relation_label,
  File "/Users/user/sent_code/.python/lib/python3.5/site-packages/nltk/parse/dependencygraph.py", line 328, in _parse
    assert cell_number == len(cells)
AssertionError

I then tracked down the sentence that causes this error. It is:

'for all of its insights into the dream world of teen life , and its electronic expression through cyber culture , the film gives no quarter to anyone seeking to pull a cohesive story out of its 2 1/2-hour running time . \n'

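I modified the sentence several times to see what triggers the assertion error. The sentence parses when I remove the "/" from it, and the assertion error is thrown when the "/" is present.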
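Based on that observation, one stopgap would be to set aside sentences that contain a digit-slash-digit sequence before they reach the parser. This is only a minimal sketch: `has_fraction` and `split_by_fraction` are hypothetical helper names, and the regular expression is my own guess at which inputs are risky, not something taken from NLTK or the Stanford parser.

```python
import re

# Assumed pattern: a digit, a slash, a digit, as in "1/2".
FRACTION_RE = re.compile(r"\d/\d")


def has_fraction(sentence):
    """Return True if the sentence contains something like '1/2'."""
    return FRACTION_RE.search(sentence) is not None


def split_by_fraction(sentences):
    """Separate sentences into (safe, suspicious) lists, preserving order."""
    safe = [s for s in sentences if not has_fraction(s)]
    suspicious = [s for s in sentences if has_fraction(s)]
    return safe, suspicious


sentences = [
    "the film gives no quarter to anyone .",
    "a cohesive story out of its 2 1/2-hour running time .",
]
safe, suspicious = split_by_fraction(sentences)
print(suspicious)
```

The suspicious sentences could then be rewritten by hand or parsed one at a time inside a try/except, so that a single bad sentence does not abort the whole batch.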

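I wondered whether some special symbol is causing the problem. I went back to the NLTK source code to check what triggers this assertion (search the page for "assert": http://www.nltk.org/_modules/nltk/parse/dependencygraph.html), but I could not figure out the cause of the error.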
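The failing line in the traceback, `assert cell_number == len(cells)`, suggests a check that every row of the parser output splits into the same number of cells. The sketch below imitates such a check in plain Python. It rests on two assumptions that nothing above confirms: that rows are split on whitespace, and that a token such as "2 1/2-hour" could reach the check with a space inside it. The function name `first_mismatched_row` and the sample rows are made up for illustration and are not real parser output.

```python
def first_mismatched_row(conll_text):
    """Return the 1-based index of the first row whose cell count differs
    from the first row's, or None if all rows have the same width."""
    cell_number = None
    for index, line in enumerate(conll_text.strip().split("\n"), start=1):
        # Whitespace split (assumed): a space inside a token adds an extra cell.
        cells = line.split()
        if cell_number is None:
            cell_number = len(cells)
        elif cell_number != len(cells):
            return index
    return None


# Illustrative 10-column rows; row 2 holds a token with an embedded space.
rows = "\n".join([
    "1 its _ PRP$ PRP$ _ 4 nmod:poss _ _",
    "2 2 1/2-hour _ JJ JJ _ 4 amod _ _",
    "3 running _ VBG VBG _ 4 amod _ _",
    "4 time _ NN NN _ 0 root _ _",
])
print(first_mismatched_row(rows))
```

If this is what happens, it would explain why removing the "/" helps: without it, the tokenizer presumably no longer produces a single token that contains a space.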

Can someone explain why the error is thrown and how to fix it?

0 answers:

No answers