Question

I am trying to extract subject-verb-object combinations using the NLTK toolkit. This is my code so far. How can I do this?

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize

grammar = r"""
  NP:
    {<.*>+}          # Chunk everything
    }<VBD|VBZ|VBP|IN>+{      # Chink sequences of verbs (VBD, VBZ, VBP) and IN
  """
cp = nltk.RegexpParser(grammar)
s = "This song is the best song in the world. I really love it."
for t in sent_tokenize(s):
    text = nltk.pos_tag(word_tokenize(t))
    print(cp.parse(text))

Answer 1

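One approach you can try is to chunk the sentence into NPs (noun phrases) and VPs (verb phrases), and then build an RBS (rule-based system) on top of the chunks to assign their roles. For example, if the VP is in the active voice, the subject should be the chunk directly in front of the VP. If the VP is in the passive voice, the subject should be the NP that follows it.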
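As a rough illustration of that idea, here is a minimal sketch that handles the active-voice case only. The grammar, the helper names (`chunk_text`, `is_np`, `extract_svo`) and the rule "NP before the VP is the subject, NP after it is the object" are my own assumptions for the example, not something NLTK provides. The sentences are tagged by hand so the sketch runs without downloading a tagger model; in practice you would get the tags from `nltk.pos_tag(word_tokenize(t))` as in the question.

```python
from nltk import RegexpParser
from nltk.tree import Tree

# Illustrative grammar (an assumption, tune it for your data).
GRAMMAR = r"""
  NP: {<DT|PRP\$>?<JJ.*>*<NN.*>+}   # determiner or possessive, adjectives, nouns
      {<PRP>}                        # a bare pronoun
  VP: {<RB.*>*<MD>?<VB.*>+}          # optional adverbs and modal, then verbs
"""

parser = RegexpParser(GRAMMAR)


def chunk_text(chunk):
    """Join the words of a chunk back into a plain string."""
    return " ".join(word for word, _tag in chunk.leaves())


def is_np(node):
    """True if the node is an NP chunk rather than a loose (word, tag) pair."""
    return isinstance(node, Tree) and node.label() == "NP"


def extract_svo(tagged_sentence):
    """Return (subject, verb, object) triples, assuming active voice.

    Rule used here: the NP directly before a VP is the subject and the
    NP directly after it is the object. Missing roles are None.
    """
    children = list(parser.parse(tagged_sentence))
    triples = []
    for i, node in enumerate(children):
        if not (isinstance(node, Tree) and node.label() == "VP"):
            continue
        before = children[i - 1] if i > 0 else None
        after = children[i + 1] if i + 1 < len(children) else None
        subject = chunk_text(before) if is_np(before) else None
        obj = chunk_text(after) if is_np(after) else None
        # Keep only the verb tokens, dropping adverbs and modals.
        verb = " ".join(w for w, t in node.leaves() if t.startswith("VB"))
        triples.append((subject, verb, obj))
    return triples


# Hand-tagged versions of the two sentences from the question.
sent1 = [("This", "DT"), ("song", "NN"), ("is", "VBZ"), ("the", "DT"),
         ("best", "JJS"), ("song", "NN"), ("in", "IN"), ("the", "DT"),
         ("world", "NN"), (".", ".")]
sent2 = [("I", "PRP"), ("really", "RB"), ("love", "VBP"),
         ("it", "PRP"), (".", ".")]

print(extract_svo(sent1))
print(extract_svo(sent2))
```

A passive-voice rule would need an extra check on the VP (for example a form of "be" followed by a VBN) before swapping the roles; that part is left out here.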

You can also have a look at Pattern.en. Its parser includes relation extraction: http://www.clips.ua.ac.be/pages/pattern-en#parser
