Question

There are already many questions related to MaltParser and/or NLTK.

There is now a more stable version of the MaltParser API in NLTK: https://github.com/nltk/nltk/pull/944, but it has problems when parsing multiple sentences at once.

Parsing one sentence at a time seems to work fine:

>>> from nltk.parse.malt import MaltParser
>>> _path_to_maltparser = '/home/alvas/maltparser-1.8/dist/maltparser-1.8/'
>>> _path_to_model = '/home/alvas/engmalt.linear-1.7.mco'
>>> mp = MaltParser(path_to_maltparser=_path_to_maltparser, model=_path_to_model)
>>> sent = 'I shot an elephant in my pajamas'.split()
>>> sent2 = 'Time flies like banana'.split()
>>> print(mp.parse_one(sent).tree())
(pajamas (shot I) an elephant in my)

But parsing a list of sentences does not return a DependencyGraph object:

>>> print(next(mp.parse_sents([sent,sent2])))
<listiterator object at 0x7f0a2e4d3d90>
>>> print(next(next(mp.parse_sents([sent,sent2]))))
[{u'address': 0, u'ctag': u'TOP', u'deps': [2], u'feats': None, u'lemma': None, u'rel': u'TOP', u'tag': u'TOP', u'word': None},
 {u'address': 1, u'ctag': u'NN', u'deps': [], u'feats': u'_', u'head': 2, u'lemma': u'_', u'rel': u'nn', u'tag': u'NN', u'word': u'I'},
 {u'address': 2, u'ctag': u'NN', u'deps': [1, 11], u'feats': u'_', u'head': 0, u'lemma': u'_', u'rel': u'null', u'tag': u'NN', u'word': u'shot'},
 {u'address': 3, u'ctag': u'AT', u'deps': [], u'feats': u'_', u'head': 11, u'lemma': u'_', u'rel': u'nn', u'tag': u'AT', u'word': u'an'},
 {u'address': 4, u'ctag': u'NN', u'deps': [], u'feats': u'_', u'head': 11, u'lemma': u'_', u'rel': u'nn', u'tag': u'NN', u'word': u'elephant'},
 {u'address': 5, u'ctag': u'NN', u'deps': [], u'feats': u'_', u'head': 11, u'lemma': u'_', u'rel': u'nn', u'tag': u'NN', u'word': u'in'},
 {u'address': 6, u'ctag': u'NN', u'deps': [], u'feats': u'_', u'head': 11, u'lemma': u'_', u'rel': u'nn', u'tag': u'NN', u'word': u'my'},
 {u'address': 7, u'ctag': u'NNS', u'deps': [], u'feats': u'_', u'head': 11, u'lemma': u'_', u'rel': u'nn', u'tag': u'NNS', u'word': u'pajamas'},
 {u'address': 8, u'ctag': u'NN', u'deps': [], u'feats': u'_', u'head': 11, u'lemma': u'_', u'rel': u'nn', u'tag': u'NN', u'word': u'Time'},
 {u'address': 9, u'ctag': u'NNS', u'deps': [], u'feats': u'_', u'head': 11, u'lemma': u'_', u'rel': u'nn', u'tag': u'NNS', u'word': u'flies'},
 {u'address': 10, u'ctag': u'NN', u'deps': [], u'feats': u'_', u'head': 11, u'lemma': u'_', u'rel': u'nn', u'tag': u'NN', u'word': u'like'},
 {u'address': 11, u'ctag': u'NN', u'deps': [3, 4, 5, 6, 7, 8, 9, 10], u'feats': u'_', u'head': 2, u'lemma': u'_', u'rel': u'dep', u'tag': u'NN', u'word': u'banana'}]

Why doesn't parse_sents() return an iterable of parse_one() results?

I could, however, be lazy and just do:

>>> _path_to_maltparser = '/home/alvas/maltparser-1.8/dist/maltparser-1.8/'
>>> _path_to_model = '/home/alvas/engmalt.linear-1.7.mco'
>>> mp = MaltParser(path_to_maltparser=_path_to_maltparser, model=_path_to_model)
>>> sent1 = 'I shot an elephant in my pajamas'.split()
>>> sent2 = 'Time flies like banana'.split()
>>> sentences = [sent1, sent2]
>>> for sent in sentences:
...     print(mp.parse_one(sent).tree())

But that is not the solution I am looking for. My question is why parse_sents() does not return an iterable of parse_one() results, and how this could be fixed in the NLTK code.

After @NikitaAstrakhantsev's answer, I tried again and it now outputs a parse tree, but the parser seems to get confused and joins both sentences into one before parsing.

# Initialize a MaltParser object with a pre-trained model.
mp = MaltParser(path_to_maltparser=_path_to_maltparser, model=_path_to_model)
sent = 'I shot an elephant in my pajamas'.split()
sent2 = 'Time flies like banana'.split()
# Parse a single sentence.
print(mp.parse_one(sent).tree())
# Parse both sentences and print the tree of the first result.
print(next(next(mp.parse_sents([sent,sent2]))).tree())

[OUT]:

(pajamas (shot I) an elephant in my)
(shot I (banana an elephant in my pajamas Time flies like))

From the code, it seems to be doing something odd: https://github.com/nltk/nltk/blob/develop/nltk/parse/api.py#L45

Why does the parser abstract class in NLTK squash two sentences into one before parsing? Am I calling parse_sents() incorrectly? If so, what is the correct way to call parse_sents()?

Answer 1

As far as I can see in your code samples, you don't call tree() on this line

>>> print(next(next(mp.parse_sents([sent,sent2]))))

whereas you do call tree() in every case where you use parse_one().

Otherwise, I don't see why this would happen: the parse_one() method of ParserI is not overridden in MaltParser, and all it does is call MaltParser's parse_sents(); see the code.
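The delegation described above can be illustrated with a toy sketch. The classes `BaseParser` and `ToyParser` below are hypothetical stand-ins, not NLTK's actual `ParserI` or `MaltParser` code; they only mirror the shape suggested by the question's output, where `parse_sents()` yields one iterator of parses per sentence.

```python
class BaseParser:
    """Hypothetical stand-in for an abstract parser interface."""

    def parse_sents(self, sentences):
        raise NotImplementedError

    def parse_one(self, sentence):
        # Delegate to parse_sents() with a one-sentence batch and
        # return the first parse of the first (and only) sentence.
        return next(next(self.parse_sents([sentence])))


class ToyParser(BaseParser):
    """Hypothetical subclass that overrides only parse_sents()."""

    def parse_sents(self, sentences):
        # Yield one iterator of "parses" per input sentence. A real parser
        # would yield dependency graphs; here a parse is just a tuple.
        for sentence in sentences:
            yield iter([tuple(sentence)])


parser = ToyParser()
sent = "I shot an elephant in my pajamas".split()
sent2 = "Time flies like banana".split()

# One sentence: parse_one() unwraps both iterator levels.
first = parser.parse_one(sent)

# Several sentences: the caller unwraps the inner iterator itself.
parses = [next(per_sentence) for per_sentence in parser.parse_sents([sent, sent2])]
```

If the real wrapper follows this shape, iterating over the outer result and taking `next()` of each inner iterator should give one parse per sentence, provided the underlying parser keeps the sentences separate.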

Update: The line you're talking about is not called, because parse_sents() is overridden in MaltParser and is called directly.

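My only guess at this point is that the Java MaltParser library does not handle an input file containing several sentences correctly (I mean this block, where Java is run). Perhaps the original MaltParser has changed its input format, and the sentence separator it expects is no longer a double newline. Unfortunately, I can't run this code myself, because maltparser.org has been down for two days. I checked that the input file has the expected format (sentences are separated by a double newline), so it is very unlikely that the Python wrapper is merging the sentences.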
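To make the separator hypothesis concrete, here is a minimal sketch of a token-per-line input file in which sentences are separated by a blank line. The helper names `to_conll` and `split_sentences` and the 10-column layout are assumptions for illustration only; this is not NLTK's writer and not MaltParser's reader.

```python
def to_conll(tagged_sentences):
    """Serialize tagged sentences: one token per line, blank line between sentences.

    The 10-column layout is an assumption made for this illustration.
    """
    blocks = []
    for sentence in tagged_sentences:
        lines = []
        for index, (word, tag) in enumerate(sentence, start=1):
            columns = [str(index), word, "_", tag, tag, "_", "0", "a", "_", "_"]
            lines.append("\t".join(columns))
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n\n"


def split_sentences(conll_text):
    """Split serialized text back into per-sentence blocks on blank lines."""
    return [block for block in conll_text.split("\n\n") if block.strip()]


tagged = [
    [("I", "NN"), ("shot", "NN")],
    [("Time", "NN"), ("flies", "NNS")],
]
text = to_conll(tagged)

# With the blank-line separator intact, the reader sees two sentences.
blocks = split_sentences(text)

# If the separator is lost or not recognized, everything collapses into one block.
merged = split_sentences(text.replace("\n\n", "\n"))
```

A reader that does not recognize the separator would see a single long sentence, which would be consistent with the merged tree shown in the question.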

Parsing multiple sentences with MaltParser using NLTK
