Regarding Chunking

Time: 2019-05-17 19:29:47

Tags: python nlp nltk

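I am trying to work through the basic concept of chunking with NLTK, but I keep ending up with an unexpected error (I first took it for an indentation error):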

#Chunking
import nltk
from nltk.corpus import state_union
import matplotlib 
from nltk.tokenize import PunktSentenceTokenizer

train_text = state_union.raw("2005-GWBush.txt")
sample_text = state_union.raw("2006-GWBush.txt")

custom_sent_tokenizer = PunktSentenceTokenizer(train_text)

tokenized = custom_sent_tokenizer.tokenize(sample_text)

def process_stat():
    try:
        for w in tokenized:
            words = nltk.word_tokenize(w)
            tags = nltk.pos_tag(words)

            chunkGram = r"""Chunk:  {<RB.?>*<VB.?>*<NNP><NN>?}"""

            chunkParser = nltk.RegexpParser(chunkGram)
            chunked = chunkParser.parse(tags)

            print(chunked)
  

  File " ", line 26

    ^
SyntaxError: unexpected EOF while parsing
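
A likely cause, judging only from the code as posted: the `try:` inside `process_stat` has no matching `except` or `finally` clause, so the parser reaches the end of the file while still expecting one. The sketch below reproduces this with the standard library only. The `compiles` helper and the stand-in function body are illustrative assumptions, not part of the original code, and newer Python versions may word the message differently while still raising `SyntaxError`.

```python
import textwrap

# Stand-in for the posted function: a try block with no except/finally.
# The print call replaces the NLTK chunking loop (assumption for brevity).
broken = textwrap.dedent('''
def process_stat():
    try:
        print("chunking")
''')

# Same source with the missing except clause appended.
fixed = broken + "    except Exception as e:\n        print(str(e))\n"


def compiles(source):
    """Return True if the source parses, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(broken))
print(compiles(fixed))
```

Applied to the posted code, this would mean adding an `except Exception as e:` clause at the same indentation as `try:`. Note also that, in the code as shown, `process_stat()` is never called, so nothing would be printed even after the syntax error is fixed.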

0 answers:

No answers