Question

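I'm trying to work through some simple examples with NLTK that involve chunking and chinking. The problem is that none of my examples work: nothing is excluded from the parser output.
Basically, the output I want is: "Bills, ports and immigration, Brownback", with everything else excluded. Also, I can't tell whether a chinking RE may contain static strings as well as tags, because none of the examples I've seen include this. For example, can I do this? } SomeString {

I've tried many different REs in the chinking expression, but none of them seems to exclude anything.

Here is some code.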

import nltk

data = 'Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas.'

# Pre-process the text: tokenize, then POS-tag
words = nltk.tokenize.word_tokenize(data)
pos_tags = nltk.pos_tag(words)
# print(pos_tags)

# Now we want what bill was submitted, and who submitted it.
# First rule chunks tokens; the two }...{ rules are my chinking attempts.
chunk = r""" Chunk: {<.*>}
                     }<VBD>+<IN>{
                     }<IN><NNP>\${
        """
chunk_parser = nltk.RegexpParser(chunk)
print(chunk_parser)
chunked_data = chunk_parser.parse(pos_tags)
print(chunked_data)
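
For comparison, here is a minimal sketch of a grammar that does chink. As far as I can tell, `{<.*>}` puts every token into its own one-token chunk, so a chink pattern that spans several tags never finds a match inside a single chunk; `{<.*>+}` chunks the whole sentence first, which gives the chink rule something to cut. The tokens below are tagged by hand (an assumption, so the sketch runs without the tokenizer and tagger models), and the chink pattern is only one plausible choice for this sentence.

```python
import nltk

# Hand-tagged tokens (an assumption, so the sketch runs without downloading
# the tokenizer and tagger models); nltk.pos_tag may tag some words differently.
pos_tags = [
    ("Bills", "NNS"), ("on", "IN"), ("ports", "NNS"), ("and", "CC"),
    ("immigration", "NN"), ("were", "VBD"), ("submitted", "VBN"),
    ("by", "IN"), ("Senator", "NNP"), ("Brownback", "NNP"), (",", ","),
    ("Republican", "NNP"), ("of", "IN"), ("Kansas", "NNP"), (".", "."),
]

grammar = r"""
Chunk:
    {<.*>+}               # chunk the whole sentence as one chunk
    }<VB.*|IN|,|\.>+{     # chink runs of verbs, prepositions and punctuation
"""

chunk_parser = nltk.RegexpParser(grammar)
tree = chunk_parser.parse(pos_tags)

# Collect the words of every subtree labelled "Chunk"
chunks = [
    " ".join(word for word, tag in subtree.leaves())
    for subtree in tree.subtrees(filter=lambda t: t.label() == "Chunk")
]
print(chunks)
```

The tag patterns in `RegexpParser` rules match POS tags, not the words themselves, so something like `} SomeString {` would be read as a tag pattern rather than as a literal word. Dropping a specific word such as "Senator" would probably have to be done by filtering the chunk leaves after parsing.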

Any help is appreciated.

Thanks

Problem with chinking not working properly in NLTK.

0 answers: